BIO 310 lab course

Compendium
autumn 2000
BI 315 lab. course
Methods in population genetics
Department of Botany
&
Trondhjem Biological Station
(Inst. of Natural History, VM, NTNU)
J. Mork (Ed.)
co-workers:
M. Heun, B. I. Honne, S. Karlsson, T. Ryan, S. Såstad, M.-A. Østensen
CONTENTS
(T = theory, X = experiment)

Lecturers
(T) Protein electrophoresis (J. Mork)
(T) Genetic interpretation of banding patterns on gels (J. Mork)
(X) Isoelectric focusing of LDH in gadoids (J. Mork)
(T) Analysis of genetic differentiation and structure (J. Mork)
(X) Starch gel electrophoresis of fish tissue enzymes (M.-A. Østensen)
(T/X) Use of isozymes to study hybrid zones (a diploid-tetraploid hybrid zone in orchids) (S. Såstad)
(T/X) RFLP markers (the cDNA RFLP SypI*) (S. Karlsson)
(T/X) DNA markers (mini/micro-satellites, PCR reaction) (T. Ryan)

APPENDICES
(T) Hints on software for statistical tests and genetics (J. Mork)
(T) Measurements of similarities and distances (B.I. Honne)
(T) Analysis of data from Avena sterilis RAPDs (B.I. Honne)
(T) DNA analysis techniques (P. Galvin)
(T) Plant DNA markers (M. Heun)

BI 315
POPULATION GENETICS METHODOLOGY COURSE
AT TBS AUTUMN 2000 (WEEK 47 & 48)
Prof. Mork and Prof. Fenster are responsible for the course.
Personnel involved:
Name                                        Telephone         Telefax           E-mail
Prof. Jarle Mork, TBS                       47 73 59 15 89    47 73 59 15 97    Jarle.Mork@vm.ntnu.no
Prof. Charles Fenster, Bot.Inst., KB-Fak.   47 73 55 0337     47 73 59 61 00    Charles.fenster@chembio.ntnu.no
Prof. Manfred Heun, NLH Ås                  47 64 94 76 91    47 64 94 76 79    manfred.heun@ikb.nlh.no
Prof. Bjørn Ivar Honne, Planteforsk         47 74 82 62 11    47 74 82 88 11    Bjorn.Ivar.Honne@neplanteforsk.nlh.no
Dr. Tony Ryan, Max Planck, Leipzig          49 341 9952 593   49 341 9952 555   Ryan@eva.mpg.de
Dr. Sigurd Såstad, Bot. Avd. VM             47 73 59 22 51    47 73 59 22 49    Sigurd.Sastad@vm.ntnu.no
Cand. Scient. Sten Karlsson, TBS            47 73 59 15 80    47 73 59 15 97    Stenka@stud.ntnu.no
Leading Eng. Mari-Ann Østensen, TBS         47 73 59 67 99    47 73 59 15 97    Mari-Ann.Ostensen@vm.ntnu.no
Web address for course information:
http://www.ntnu.no/~jmork/jmork/courses/315H00/AGEN00.html
Lecture
PROTEIN ELECTROPHORESIS
(J. Mork, TBS)
Principle
In a direct-current (DC) electric field, charged particles such as molecules in aqueous solution migrate towards the
electrode of opposite charge. Amphoteric molecules (e.g., proteins and peptides) may have a large number of charged
groups, and their net charge depends on the pK values (the dissociation constants) of these groups
and on the pH of the aqueous medium. Because they differ in charge, the different molecules in a mixture
migrate with different velocities and are thereby separated into single fractions. In addition to the pI (the isoelectric
point) of a protein or peptide, its electrophoretic migration velocity is influenced by the type, concentration and pH
of the buffer, by the temperature and field strength (the voltage drop between the electrodes), and by the type and
pore size of the stabilizing medium (paper, agar, starch etc.).
Allelic variation (substitution of amino acids) in proteins usually does not affect molecular size appreciably.
However, many such substitutions change the net charge, which alters the electrophoretic mobility and
makes the different genotypes detectable by electrophoresis. Of special value for population genetics is that such
‘biochemical’ variation is co-dominant and allows the scoring of both alleles at a locus (i.e., there is no dominance or
recessivity).
Electrophoretic separations can take place in free solution (e.g. in capillaries) or in stabilizing media such as silica
plates, various paper types, or gels. The development of stabilizing media during the last 50 years has gone from
paper via agar, cellulose acetate, agarose and starch to synthetic polymers of acrylamide. At the same time there
has been a development of new techniques from the ‘continuous’ separation based on charge, via separation
based on molecule size, to disc electrophoresis, immuno-electrophoresis and isoelectric focusing.
No other biochemical technique has shown such a rich diversification and played such a central role in modern
biochemistry. By electrophoresis it is possible to obtain very efficient separations with relatively simple
equipment. Application areas range from biological and biochemical research to protein chemistry,
pharmacology, forensic medicine, veterinary science, food quality control, molecular biology and genetics.
Samples may be as diverse as whole cells or particles, proteins, peptides, amino acids, organic acids and bases,
nucleic acids, drugs, and pesticides - in short, all substances that can carry electric charges.
In biological research it will probably become increasingly important to choose the most adequate separation
technique for a specific purpose, and to be able to carry out the practical procedures involved in electrophoresis.
A thorough guide to the techniques is Westermeier (1993).
Basically, there are three different principles for electrophoretic separation:
a) common zone electrophoresis b) isotachophoresis (ITP) c) isoelectric focusing (IEF)
Similarities and differences between these three are shown in the following figure (mr = relative mobility (to a
standard), pK = the dissociation constant, T = trailing ion, L = leading ion, and pI = isoelectric point, i.e., the
pH where the amphoteric compound has no net charge).
a) In zone electrophoresis we use a buffer system that is homogeneous over the entire separation area to ensure a
uniform pH. This also holds for disc electrophoresis, although there the buffer system is discontinuous at the start of
the experiment in order to concentrate the substances in a very narrow starting band (i.e., utilizing the
isotachophoresis effect).
b) In isotachophoresis (ITP) the separation takes place in a discontinuous buffer system. The ionized compounds
migrate trapped between a front ‘leading electrolyte’ and a tail ‘trailing electrolyte’, all migrating with the
same velocity. The various components of the sample distribute themselves according to their respective
electrophoretic mobilities and form a ‘stack’ with the front bands closely behind the ‘leading ion’ and the tail
bands just in front of the ‘trailing ion’. Isotachophoresis is mostly used in quantitative separations (and as a
‘stacking and concentration’ step in disc electrophoresis).
c) In isoelectric focusing the separation takes place in a pH gradient created by several hundred different
ampholytes with different isoelectric points and with buffer capacity at their isoelectric points. Typical anolyte
and catholyte are, e.g., 1 M phosphoric acid and 1 M sodium hydroxide. Their function is to keep
the gradient ‘in place’ between the electrodes. IEF is very sensitive to electro-osmosis (see below), and the
supporting medium should thus be as electrically inert as possible (usual media are polyacrylamide and specially
purified agarose). IEF is suitable for amphoteric substances in which the net charge depends on pH, e.g. proteins and
peptides. The molecules migrate to the position in the pH gradient where their net charge is zero (i.e., their
isoelectric point) and their mobility is zero. Should they diffuse away from this position, the buffer effect of the
nearby ampholytes will induce a charge which forces them back into position between ampholytes with
slightly lower and slightly higher pI. The higher the field strength (voltage drop), the more concentrated
the bands will be, hence the name ‘focusing’. IEF is mostly used for qualitative characterization of
substances or mixtures of substances and for purity control, but also for preparative purposes. The pH gradient
gels can easily be made in-house, but are also commercially available as ready-made gels with different
gradients (e.g., 2-10, 3-9, 4-9, 4-6, 5-7, 5-8 etc.). Ampholyte mixtures are marketed by many firms (e.g.,
Pharmacia’s «Pharmalytes», Serva’s «Serva-Lytes», Bio-Rad’s «Biolytes»; the two latter are identical).
Ready-made gels cost much more than home-made ones (~600 kr vs ~80 kr per gel of size 24.5 x 12.5 x 0.1 cm).
Both home-made and ready-made gels have fridge shelf lives of up to one year. The chemical composition
(ampholyte type etc.) as well as the linearity of the pH gradient vary considerably between brands. There is
also some variation in prices.
The buffer system in electrophoresis
Common electrophoresis takes place in a buffer with accurate pH and constant ionic strength. The ionic strength
should be as low as practically possible in order to achieve high field strength (voltage drop) and thereby rapid
migration/separation, but not so low that the proteins are not pH-buffered by the medium, or the buffer capacity
is used up before the separation is completed. During electrophoresis, the buffer ions migrate through the gel in
the same manner as the sample molecules; anions towards the anode and cations towards the cathode (in vertical
electrophoresis the buffer pH is set so that all molecules of interest migrate towards the same electrode; in
practice from the upper to the lower part of the gel).
The buffer ions are responsible for most of the conductivity in the supporting medium. The lower the
conductivity (i.e. the ionic strength), the less Joule heat is produced, and the higher the field strength that can be
employed without overloading the cooling capacity of the system. The cooling is usually achieved by a cooling
plate connected to a circulating thermostat. The buffer capacity must be large enough to ensure constant pH
during the entire experiment. The capacity is regulated by the amount of buffer and/or its concentration.
The problem of electro-osmosis
If the gel support (glass plate, plastic film etc.) or the separating medium itself carries electrical charges, a
phenomenon called electro-osmosis occurs. If the charges are negative, water in the buffer will migrate towards
the cathode and carry sample molecules with it (so-called cathodic drift). This can either counteract or
increase the ordinary electrophoretic mobility. The high voltages employed make IEF particularly sensitive to
electro-osmosis, not least when media are used which are not totally electrically inert (like some brands of
«IEF-grade» agaroses). However, the phenomenon is also common in ordinary electrophoresis in agar, paper and
cellulose acetate. Gels made from starch and polyacrylamide show practically no electro-osmosis.
Joule heat and the cooling system
After separation, the bands should be as distinct and concentrated as possible. Prolonged analysis time will
usually lead to unwanted band diffusion. One way to shorten the analysis time (i.e., diffusion time) is to increase
the field strength (i.e., the applied voltage) over the gel. However, this will also increase the Joule heat produced
(cf figure below), and this may lead to problems like protein denaturation, gel artifacts (melting agarose),
«smearing» of bands, etc. It is therefore very important to design the experiment so that the separation takes place
as quickly as possible, but with no more Joule heat produced than can be carried away by the cooling system.
Very basic knowledge of the apparatus and of the relations between voltage, current, conductivity, power and
Joule heat makes it relatively easy to avoid problems of this kind. It is the total applied power (measured in
watts) that determines the heat production in the system, and this must be matched by the cooling plate capacity. The
current is necessarily the same at any point between the electrodes. In places where the resistance is large
(conductivity low), either because the cross-section of the circuit «lead» is small or because there are few ions
present, the system will «use up» most of the voltage to «force» the current through. With constant current and
high field strength these parts of the circuit will use more of the available wattage and therefore produce more Joule
heat than other parts. Typically, this is the gel, which often has both a smaller cross-section and a lower
conductivity than the buffer chamber. Therefore, the cooling plate is placed under the gel. Cooling plates made
of metal (NB! must be electrically insulated!) or ceramics are much more efficient than those of glass, and will
allow higher power and hence shorter analysis time. As a rule of thumb, a 1 mm thick gel on a metal or ceramic
cooling plate can tolerate an applied power of 0.2 W per cm2 of gel without substantial temperature increase (i.e., not
more than 2-3 degrees centigrade higher in the gel than in the coolant).
The Joule heat produced is directly dependent on the power applied to the electrophoresis system. The power
obeys the following simple equation:
Power (watt) = voltage (volt) x current (ampere)
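As a rough illustration of the relation above, the following Python sketch compares the power applied in a run with the 0.2 W/cm2 rule of thumb. The run settings (1500 V, 30 mA) and the function names are assumptions made for the example only; the gel footprint is the 24.5 x 12.5 cm format mentioned in the section on isoelectric focusing.

```python
def joule_power_watts(voltage_v: float, current_a: float) -> float:
    """Total electrical power dissipated in the system: P = V * I."""
    return voltage_v * current_a


def max_power_for_gel(length_cm: float, width_cm: float,
                      limit_w_per_cm2: float = 0.2) -> float:
    """Upper power limit from the 0.2 W/cm2 rule of thumb quoted in the text
    (1 mm thick gel on a metal or ceramic cooling plate)."""
    return length_cm * width_cm * limit_w_per_cm2


# Hypothetical run settings, chosen only for illustration.
voltage_v = 1500.0
current_a = 0.030  # 30 mA
power_w = joule_power_watts(voltage_v, current_a)

# Gel footprint of 24.5 cm x 12.5 cm.
limit_w = max_power_for_gel(24.5, 12.5)

print(f"applied power: {power_w:.1f} W, rule-of-thumb limit: {limit_w:.2f} W")
print("within cooling capacity" if power_w <= limit_w else "too much Joule heat")
```

Note that the rule of thumb concerns the power dissipated in the gel itself; the power read off the power supply also includes what is dissipated in the wicks and buffer chambers, so the comparison is conservative.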
Tissue samples; properties and treatment of proteins
An important criterion for the choice of electrophoretic method is the type of sample which is to be analysed. One
line can be drawn between denaturing methods (e.g. SDS electrophoresis) and methods where the biological
activity of the protein must be preserved. Another line is between amphoteric compounds (proteins, peptides)
and non-amphoteric substances. Common to them all is that the sample should not contain particles, oil drops etc.
because these may block the pores of the medium.
Protein extracts are usually prepared by homogenization in aqueous solutions (distilled water or buffer). Since e.g.
enzyme loci may be differently expressed in different tissue types, it can often be useful and efficient to
homogenize several tissue types together (e.g. muscle and liver) in the same vial in order to have more loci
represented. Usually, a few seconds of forceful mincing of the tissue sample (1 cm3 in twice that volume of liquid)
with a glass rod is sufficient to break up the cells and release the proteins in animal tissues (plant tissues may
need more labour). It is usually desirable to centrifuge the homogenates (e.g. 10,000 x g for 10 minutes) to avoid
cell debris in the extracts which may block the pores of the medium.
Some proteins are very tough and can stand rough treatment in the field as well as in the laboratory, while
others are extremely sensitive to factors like elevated temperatures, oxidation, low ionic strength, and too high or
too low pH (low pH is usually worse than high). The properties of different proteins must be learned by
experience in each organism and each organ.
However, there are a few general rules. For example, proteins (e.g. enzymes) which usually perform their
function at relatively high temperatures will better tolerate high temperature and storage in the laboratory. Thus,
mammalian proteins are usually more stable at room temperature than proteins from e.g. fish.
In any case, the best results are usually obtained when using fresh (not frozen) samples.
Bacterial degradation can be a serious problem. It is advisable to strive for as sterile treatment as possible
during all stages of sample preparation, to keep the samples chilled, and to avoid drying-out as well as too much
sample dilution. In some cases, the use of a bacteriostat like sodium azide can be necessary to avoid bacterial growth.
In addition, the pH of the extraction buffer should not be too far from the natural milieu of the protein since
physiological conditions will usually increase its lifetime. If samples are to be stored for prolonged periods (e.g.
more than 1-2 weeks) this should take place at ultra-low temperatures (e.g., at -70 degrees C or lower) in a
«bio-freezer», on dry ice, or in liquid nitrogen, and packed in a way which avoids drying-out and exposure to
atmospheric oxygen. One should be aware, however, that some proteins will not tolerate the freezing/thawing process.
In such cases freeze-drying may be an alternative.
Detection of inherited variation in proteins
Mutations are the main source of inherited protein variation. In a point mutation the DNA polymerase has
made a copying error, which results in the incorporation of the «wrong» amino acid in the protein.
Between one fourth and one third of the possible amino acid substitutions will lead to a difference in charge
between the original protein and the mutant, so that they can be detected by differences in electrophoretic
mobility. Protein electrophoresis is therefore a technically simple and suitable method for detecting inherited
variation, although not all amino acid substitutions can be detected.
Most proteins are colourless and will not be visible in the gel without specific histochemical staining
procedures. For general protein staining there are several more or less sensitive methods. Many of the stains were
originally developed by the textile industry (e.g. the widely used Coomassie Brilliant Blue). In recent years more
sensitive techniques (e.g., «silver staining») have been developed. Commercial kits are available for silver
staining, but the technique is also thoroughly described and can easily be adopted from Westermeier (1993).
Except for some procedures for the detection of e.g. lipoproteins and glycoproteins, general protein stains are
unspecific. Enzyme stains, on the other hand, can be made very specific by basing the staining procedure on
reactions that can take place only in the presence of a specific enzyme. The principle is to incubate the gel in a
solution (or to cover the gel with an agar/agarose overlay) containing the enzyme’s substrate as well as the
necessary co-factors (like NAD, NADP etc.) for the reaction, plus reagents which result in the precipitation of a
coloured product (e.g. formazan) at the site of enzyme activity.
Documentation of results, preservation of gels
The colour reaction is stopped when the banding patterns are scorable, usually by incubating the gel in fixing
solution. Widely used fixing solutions are, e.g., 20% TCA, 10% picric acid, and a 1:4:5 mixture of acetic
acid/water/ethanol. On fixing, the protein unfolds, is trapped in the gel matrix, and loses its
biological/enzymatic activity (note that formazan bands from MTT are soluble in ethanol, so that when alcoholic
fixatives are used, NBT rather than MTT should be used). In large-pore gels, small molecules may not be
adequately trapped by the fixing and may diffuse out of the gel. One solution to this can be to reduce the pore
size quickly after separation by drying the gel.
Gels of starch and cellulose acetate can be frozen and will keep their integrity on thawing. A frozen agar or
agarose gel, however, will collapse on thawing. Both agarose and polyacrylamide gels may loosen from plastic
support films upon thawing. Recommended preservation methods for the different types of gels are (note that
photodocumentation or digitizing with a scanner is an option in all cases):
• Cellulose acetate: Drying.
• Agar/agarose: Drying onto the polypropylene support film (drying can be speeded up with a hair-dryer).
• Starch: Freezing, or drying onto filter paper in a «gel-dryer».
• Polyacrylamide: Drying onto the polypropylene support film, and covering with an extra plastic film.
Lecture:
GENETIC INTERPRETATION OF BANDING PATTERNS
ON GELS
(J. Mork, TBS)
There will usually be individual variation in the banding patterns after staining. The variation may be phenotypic
or genotypic. Phenotypic variation can be caused by post-translational changes to the protein such as partial
degradation, glycosylation, polymerization etc., and is usually of limited value for the purpose of studying genetic
variation.
For the variation in banding patterns to be described as genetic, certain assumptions must be fulfilled. These are
based on the Mendelian laws of inheritance, the Hardy-Weinberg theorem, combinatorics, and knowledge of
the quaternary structure of each protein (for protein substructure see e.g. Darnall & Klotz 1975).
Ideally, the heritability of protein variants should be checked in offspring groups from controlled crossings of
parents with known ‘genotype’. In the absence of, or in anticipation of, such data, the observed ‘genotypic’
distribution in an adequately large number of individuals may be tested against Hardy-Weinberg expectations. For this
purpose one should of course use samples which by other (e.g. biological) criteria appear to represent one single,
panmictic population. (See section on ”The Hardy-Weinberg theorem...” below). The test procedure applied
for this purpose is the chi-square goodness-of-fit test, which is carried out as in the following hypothetical
example:
Suppose that by visual inspection of banding patterns among 100 diploid individuals, three different patterns are
found: either one or the other of two dense bands with different positions on the gel, or both those bands but with
only half the density in each band. The three types occur in the numbers 34, 14, and 52, respectively. We
hypothesize that the bands are caused by a two-allele (A and B) polymorphism at one locus, and assign the
genotypes AA, BB, and AB to these three banding patterns. The protein thus appears to be a monomer (see
chapter ”Banding patterns...” below), with one gene product (gel band) in the homozygotes and two in the
heterozygote. We want to test whether the observed distribution of our ‘genotypes’ is in accordance with this
interpretation, and carry out the chi-square ”Goodness-of-fit” test:
H0: The sample is taken from a population where AA, AB, and BB are distributed according to Hardy-Weinberg
equilibrium proportions.
H1: The sample proportions of AA, AB, and BB deviate too much from H-W equilibrium for H0 to be correct.
                 AA             AB             BB             N      qA     qB
Observed         34             52             14             100    0.6    0.4
Expected (HW)    36             48             16
chi-square       (34-36)^2/36   (52-48)^2/48   (14-16)^2/16

Pooled chi-square = 0.11 + 0.33 + 0.25 = 0.69. Degrees of freedom (DF) = 3 - 2 = 1
P (probability of worse fit) = 0.406
Conclusion: H0 is not rejected.
That the distribution of AA, AB, and BB is in accordance with the Hardy-Weinberg expectations can be taken as
substantial support for the hypothesis that the variation is heritable and is caused by allelic variation at one locus.
(Of course, the ultimate test would have to be based on controlled crossing of parents with known genotype).
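The test in the hypothetical example can be reproduced with a few lines of Python. This is only a sketch for the two-allele, one-degree-of-freedom case; the function name is an assumption for this example, and the P value is obtained from the standard-library `math.erfc`, which gives the upper tail of the chi-square distribution when DF = 1.

```python
import math


def hardy_weinberg_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square goodness-of-fit test of two-allele genotype counts
    against Hardy-Weinberg proportions. Returns (chi2, p_value)."""
    n = n_aa + n_ab + n_bb
    q_a = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q_b = 1.0 - q_a                    # frequency of allele B
    expected = (n * q_a ** 2, n * 2 * q_a * q_b, n * q_b ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the probability of a larger chi-square
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p = hardy_weinberg_chi_square(34, 52, 14)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```

The result should agree with the hand calculation in the table; small differences in the third decimal of P can arise because the hand calculation rounds each chi-square term before summing.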
Banding patterns caused by different quaternary structures of the protein
The example above concerned the simplest possible situation - a monomeric protein where the homozygote
pattern is one-banded and the heterozygote is two-banded. In cases where the alleles code for sub-units of
composite proteins (dimers, trimers, tetramers etc), more complex heterozygote patterns will be seen on the gel.
Dimeric proteins:
Three combinations of sub-units X and Y are possible (XX, XY, and YY). The heterozygote will show one X-band,
one intermediate XY-band, and one Y-band. The relative amounts (and staining intensities) of the three
types will, according to simple combinatorics, be 1:2:1.
Trimeric proteins:
Four possible combinations: XXX, XXY, XYY, YYY with expected intensity 1:3:3:1.
Tetrameric proteins:
Five possible combinations: XXXX, XXXY, XXYY, XYYY, YYYY with expected intensities 1:4:6:4:1.
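The intensity ratios listed above are the binomial coefficients for n sub-units. A minimal Python sketch, assuming that the two sub-unit types are produced in equal amounts and associate at random (the function name is chosen for this example only):

```python
from math import comb


def heterozygote_band_ratios(n_subunits: int) -> list[int]:
    """Expected relative band intensities in a heterozygote for a protein
    built from n_subunits sub-units of two types, X and Y.
    Band k contains k sub-units of type Y and n_subunits - k of type X."""
    return [comb(n_subunits, k) for k in range(n_subunits + 1)]


for name, n in [("dimer", 2), ("trimer", 3), ("tetramer", 4)]:
    print(name, heterozygote_band_ratios(n))
```

For a monomer (n = 1) the same function gives 1:1, i.e. the two-banded heterozygote of the earlier example.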
Interlocus hybrid bands
When several loci code for sub-units of composite proteins one will often (but not always) find molecules which
are composed of sub-units from different loci. In the simplest case, two monomorphic loci for a dimeric protein,
this may be manifested as one hybrid band midway between the two homodimeric bands of the two loci. The
banding pattern will be substantially more complex when looking at multiple loci coding for sub-units of
tetrameric proteins. In general, the theoretically expected number of bands, I, will be (Harris & Hopkinson 1977):
I = (L + h + n - 1)! / [n! (L + h - 1)!]
where L = number of loci, h = number of heterozygous loci per individual, and n = number of sub-units per protein.
For example, a double heterozygote for LDH-2* and LDH-3* in cod will be expected to show:
I = (2+2+4-1)! / [4! (2+2-1)!] = 35 bands. For a triple heterozygote the number would be 126.
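The band-count formula is easy to evaluate for other combinations of loci and sub-unit numbers. The Python sketch below implements it directly; the function and argument names are assumptions for this example, and the triple-heterozygote case is taken to mean three loci, all heterozygous.

```python
from math import factorial


def expected_band_count(n_loci: int, n_heterozygous: int, n_subunits: int) -> int:
    """Theoretically expected number of bands,
    I = (L + h + n - 1)! / [n! (L + h - 1)!],
    with L loci, h heterozygous loci and n sub-units per protein."""
    kinds = n_loci + n_heterozygous  # L + h, the number of distinct sub-unit types
    return factorial(kinds + n_subunits - 1) // (
        factorial(n_subunits) * factorial(kinds - 1)
    )


print(expected_band_count(2, 2, 4))  # double heterozygote, tetramer
print(expected_band_count(3, 3, 4))  # triple heterozygote, tetramer
print(expected_band_count(2, 0, 2))  # two monomorphic loci, dimer
```

The last call corresponds to the simplest case described in the text: two homodimeric bands plus one interlocus hybrid band.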
Inter-specific hybrid patterns
Species hybrids usually show the protein bands from both parent species. Since even closely related species
rarely have the same alleles at a locus, electrophoresis is a very efficient method both for the identification of
species and the detection of species hybrids. Particularly, this applies to the younger stages in species where species
characteristics develop at more advanced stages of development, and for species which in general are difficult to
identify morphologically. The method has, e.g., been applied with success on eggs and adults of salmon/trout and
various mosses and their hybrids.
Patterns in lower and higher levels of ploidy
In haploid species, only one allele is manifested and therefore only one band is seen at each locus. The existence
of ‘heterozygous’ banding patterns in such species must therefore be due to inter-locus hybrid bands.
Some species (e.g. salmon, several grass species etc.) are natural polyploids and may have double, triple etc. sets of
genes at their loci. Polymorphisms in such species may show patterns and staining intensities which deviate from
those shown in the figure above, mainly with respect to the symmetry in band intensities of the heterozygotes.
The reason for this is that proteins synthesized at several loci may have the same electrophoretic mobility (lying
‘on top’ of each other in the gel). If, for example, an individual is heterozygous for a dimeric protein at a
duplicated locus, one of the homodimers will stain more intensely than the other, because the
homodimers from the other (monomorphic) locus have the same electrophoretic mobility. Testing the zymograms
with gel-scanners has shown good correlation between observed and expected staining intensity in cases where
the individuals were scored as heterozygotes at a duplicated locus. Thus in practice the genotype scoring need
not be a problem at higher levels of ploidy.
The Hardy-Weinberg theorem, and tests for genetic equilibrium
The so-called Hardy-Weinberg principle was formulated independently and nearly simultaneously (1908) by the English
mathematician G.H. Hardy and the German physician W. Weinberg, and can be expressed like this:
«Single locus genotype frequencies after one generation of random mating can be expressed by the binomial (if
2 alleles) or multinomial (if >2 alleles) function of the allele frequencies»
Under certain assumptions, the allele- and genotypic proportions will be constant over generations and may serve
as population characteristics. These assumptions are:
1. Panmixia (random mating)
2. No mutations
3. Very (infinitely) large population size
4. No immigration
5. No selection
Even if no natural population fulfills all these assumptions completely, the effect of deviations from them is
usually not considered large enough to make the test of Hardy-Weinberg proportions meaningless. Such a test is
already shown above. Here follows a brief sketch of the procedure:
First, the number of the different genotypes in a sample is counted: for example AA:51, AB:38, and BB:11
among N=100 diploid individuals. To calculate the expected numbers (under H-W equilibrium) of the three
genotypes we first calculate the frequency of each of the alleles, and insert these frequencies into the binomial
formula. We found 51 individuals with genotype AA (which means double dose A) and 38 heterozygotes AB
which have single dose A. Altogether we have (51+38+11)*2 = 200 alleles in our sample, of which (51*2) + 38 =
140 are A-alleles, giving the frequency 140/200=0.7 for the A-allele. The frequency of the B-allele must then be
1-0.7 = 0.3. The binomial formula:
(p + q)^2 = p^2 + 2pq + q^2,
will, when we insert p=0.7 and q=0.3, give the proportions 0.49, 0.42, and 0.09 as H-W expectations for the three
genotypes AA, AB, and BB. To find the expected absolute number of each genotype, these proportions are
multiplied by the sample size (N=100). The expected numbers of the three genotypes under Hardy-Weinberg
equilibrium are thus: AA:49, AB:42, and BB:9, which are somewhat different from the observed numbers.
Whether the deviation is too large to be coincidental can be tested by a chi-square goodness-of-fit test as shown
in the table above. The principle for calculating chi-square is, for each genotype, to square the difference between
observed and expected number, and to divide the result by the expected number. The numbers thus obtained are
summed to yield the total chi-square which can be looked up in a table of critical values. In goodness-of-fit tests
(but not in chi-square contingency tables), the degrees of freedom (abbreviated DF) needed when looking up in a
chi-square table are calculated as the number of genotypes minus the number of alleles. In the present case this is
(3-2)=1 DF. A chi-square table will tell us the probability that a deviation (represented by the chi-square) of a
certain size may be due to chance, or whether it is statistically significant and thus probably represents a real
deviation from Hardy-Weinberg proportions.
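The allele counting and the binomial (or, with more than two alleles, multinomial) expectations described above can be sketched in Python as follows. The dictionary layout, with genotypes as pairs of allele names, and the function names are assumptions made for this example; the counts are those of the worked example (AA:51, AB:38, BB:11).

```python
from collections import Counter
from itertools import combinations_with_replacement


def allele_frequencies(genotype_counts):
    """genotype_counts maps (allele, allele) pairs to numbers of individuals.
    Each diploid individual contributes two alleles."""
    allele_counts = Counter()
    for (a1, a2), n in genotype_counts.items():
        allele_counts[a1] += n
        allele_counts[a2] += n
    total = sum(allele_counts.values())
    return {allele: c / total for allele, c in allele_counts.items()}


def expected_genotype_counts(genotype_counts):
    """Expected numbers under Hardy-Weinberg equilibrium:
    N * p^2 for homozygotes and N * 2pq for heterozygotes."""
    freqs = allele_frequencies(genotype_counts)
    n = sum(genotype_counts.values())
    expected = {}
    for a1, a2 in combinations_with_replacement(sorted(freqs), 2):
        prob = freqs[a1] ** 2 if a1 == a2 else 2 * freqs[a1] * freqs[a2]
        expected[(a1, a2)] = n * prob
    return expected


observed = {("A", "A"): 51, ("A", "B"): 38, ("B", "B"): 11}
freqs = allele_frequencies(observed)
expected = expected_genotype_counts(observed)
for genotype, count in expected.items():
    print(genotype, round(count, 1))
```

Because the expectations are generated for every pair of alleles found in the sample, the same functions can be used unchanged for loci with three or more alleles.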
Nomenclature for loci and alleles
While the name of the protein (enzyme) is spelled out in normal type or as an abbreviation (e.g., lactate
dehydrogenase, LDH), the locus which codes for the protein is by convention written in italics and with an
asterisk (*) after it. If more than one locus codes for the same protein, the loci are numbered 1, 2 etc. starting with
the most cathodic one (e.g. LDH-1*, LDH-2*, LDH-3*), unless historic priority dictates otherwise for the locus
under study.
The alleles at a locus are named by their mobility relative to the most common allele in the total material (pooled
samples from many locations), which is assigned the value 100. A band which migrates half the distance of this
100 band would be called 50. If the migration is in the opposite direction (i.e. towards the cathode), a negative
sign (-) is added to the allele name. Thus a band which migrates half the distance of the 100 band but in the
opposite direction would be called -50. The alleles behind the bands are written in italics. A heterozygote would
thus be called LDH-3* 100/-50. In reports from protein electrophoresis, it has become a convention to present gel
pictures and diagrams with the anode at the top.
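The naming rule can be expressed as a small calculation. In the Python sketch below, the migration distances are hypothetical values measured from the origin in millimetres, positive towards the anode and negative towards the cathode, and the reference (100) allele is assumed to migrate towards the anode; function names are chosen for this example only.

```python
def allele_name(band_distance_mm: float, reference_distance_mm: float) -> int:
    """Mobility of a band relative to the most common allele (= 100).
    Distances are signed: positive towards the anode, negative towards
    the cathode. The reference distance is assumed to be positive."""
    return round(100 * band_distance_mm / reference_distance_mm)


def genotype_label(locus: str, allele_1: int, allele_2: int) -> str:
    """Compose a genotype designation such as 'LDH-3* 100/-50'."""
    return f"{locus}* {allele_1}/{allele_2}"


reference_mm = 40.0  # hypothetical migration distance of the 100 band
common = allele_name(40.0, reference_mm)
cathodal = allele_name(-20.0, reference_mm)
print(genotype_label("LDH-3", common, cathodal))
```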
References for chapter «Protein electrophoresis»
Darnall, D.W. & Klotz, I.M. 1975. Subunit constitution of proteins - A Table. Arch. Biochem. Biophys. 166: 651-682.
Harris, H. & Hopkinson, D.A. 1977. Handbook of enzyme electrophoresis in human genetics. Elsevier/North-Holland
Biomedical Press.
Westermeier, R. 1993. Electrophoresis in practice. VCH Publishers Inc., NY. 277 pp. ISBN 1-56081-705-4.
Lab. experiment:
ISOELECTRIC FOCUSING IN POLYACRYLAMIDE GEL
(IFPAG) AND HISTOCHEMICAL STAINING OF LDH
ALLOZYMES IN TISSUE EXTRACTS FROM MARINE
GADOIDS
(J. Mork, TBS)
Background:
The heart form (LDH-B) of the tissue enzyme LDH (lactate dehydrogenase; E.C. 1.1.1.27) is polymorphic in
Atlantic cod. Two common and several rare alleles have been demonstrated in cod along the Norwegian coast.
This experiment uses cod LDH to demonstrate robust techniques for IFPAG, histochemical staining of enzymes,
interpretation of gel patterns, genotyping, and estimation of allele frequencies.
IFPAG:
The analytic method will be Isoelectric Focusing in Polyacrylamide Gels (IFPAG). A broad pH gradient gel
(Serva-Lyte 4-9 technical grade) will be used, and the final ampholyte concentration will be 2%. The gel will
have a total acrylamide concentration (T) of 5%, and the degree of crosslinking (C) will be 3%. 1 M
phosphoric acid and 1 M sodium hydroxide are used for the anode and cathode electrode wicks, respectively.
Photopolymerization (Riboflavin-5-P) is used. Procedures for mounting gel cassettes and moulding gels are
demonstrated.
Instrumentation:
Bio-Rad «Biophoresis» electrophoresis apparatus equipped for IFPAG. The power supply is an LKB 2103, and the
cooling circulator is a Desaga Frigostat thermostated at 4 °C.
The analysis:
Homogenization of tissues and extraction of proteins are demonstrated. The steps in the analysis are:
1 - Mounting of gel on the cooling plate of the apparatus (5 min)
2 - Prefocusing (5 min)
3 - Cathodic application of ~50 paper pieces (incl. standards) soaked in tissue extracts (10 min)
4 - Focusing for 10 min and then removal of sample paper pieces
5 - Completing the focusing during another 60-75 min
Towards the end of the focusing the LDH staining solution is made ready (80 ml 0.5 M Tris-HCl pH 10.0, 1
gram Na-lactate, 10 mg each of NAD, NBT and PES).
6 - After completing the focusing, the electrode wicks are removed, and the gel is incubated in the staining
solution (dark, 40 °C) until the blue formazan bands formed at the sites of activity are strong enough to allow
genotyping (5-15 min).
7 - The enzyme reaction is stopped by placing the gel in 20% TCA (10 min).
8 - Gels are then washed in generous amounts of acetic acid:ethanol:water (1:5:4) containing 10% glycerol (60
min).
9 - The gel is mounted (paper clips) on a glass plate and allowed to dry until its surface is sticky (due to the
glycerol).
10 - The gel is then covered by another plastic sheet which is rolled onto the sticky surface.
11 - The plastic covered gel can be filed as a permanent record of the experiment.
Literature describing the methods:
Mork, J. & Haug, T. 1983. Genetic variation in halibut Hippoglossus hippoglossus (L.) from Norwegian waters.
Hereditas 98: 167-174.
Lecture:
ANALYSIS OF GENETIC DIFFERENTIATION AND
STRUCTURE
(J. Mork, TBS)
Evolution can be defined as any change in population gene frequency. Given sufficient time and degree of
isolation, the evolutionary forces (mutation, genetic drift, gene flow and selection) will eventually result in
different gene frequencies in different populations («time» may mean anything from a few, to hundreds of
thousands of generations, depending on population size).
There are many models for describing genetic differentiation. One of the best known and most frequently used is
Sewall Wright’s «Mainland-Island model». It is based on a situation where a start population («Mainland») is
split into many isolated subpopulations («Islands»), and describes the genetic differentiation between these over
generations using formulae which include, e.g., population size, migration rates, and the number of generations
since population splitting (and thereby reproductive isolation). Wright used a specific statistic, Fst, as a
measure of the degree of differentiation. The Fst value tells what proportion of the total genetic variability in the
material is caused by genetic differences between populations (the rest is of course due to differences between
individuals, i.e. within populations). For example, an Fst value of 0.10 would mean that 10% of the total
variation can be attributed to differences between samples. It is worthwhile to mention that because Fst is a
relative measure (between/within), its value is not expected to be affected by the type of genetic marker used
(i.e., markers with different evolutionary rates, like e.g. isozymes and microsatellites, would be expected to yield
similar Fst estimates when applied on the same material). Measures of absolute genetic differences, on the other
hand (like Nei’s genetic distance D), are expected to give different results depending on the evolutionary rate
(mutation rate) of the actual marker. For example, mini- and micro-satellites are expected to give larger D-values
than isozymes, and this has been shown to be the case also in practice.
WRIGHT’S FST , (A RELATIVE MEASURE OF DIFFERENTIATION)
To understand the nature of Fst it is useful to have some knowledge of the Hardy-Weinberg theorem and the
so-called Wahlund effect. The latter tells that in a physical mixture (i.e., not an interbred group) of individuals
from two or more populations with different gene frequencies, the mixture will show a deficit of heterozygotes
compared to the expected (Hardy-Weinberg) proportion calculated from the joint gene frequency in the mixture.
This effect is easy to understand if looking at the extreme situation where two populations with gene frequencies
of 1.0 and 0.0, respectively, provide one half each of a mixed sample. The joint gene frequency in the mixed
group will necessarily be 0.5, and from this we would expect a proportion of heterozygotes of (0.5*0.5*2=) 0.5
from the Hardy-Weinberg theorem, while the mixed group actually has no heterozygotes! Smaller differences in
allele frequencies, or skewed proportions between the groups involved will of course create correspondingly
smaller deficits of heterozygotes, but whenever observed, a significant deficit of heterozygotes is an indication
that our sample consists of a mixture of two or more populations with different gene frequencies.
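The Wahlund effect can be checked numerically. The following Python sketch is illustrative only: it assumes a
locus with two alleles, Hardy-Weinberg proportions within each contributing population, and a function name
(`wahlund_mixture`) chosen for this example.

```python
def wahlund_mixture(freqs, proportions):
    """Observed and expected heterozygosity in a physical mixture of populations.

    freqs       : frequency p of one allele in each contributing population
    proportions : share of the mixed sample taken from each population (sums to 1)
    Returns (joint allele frequency, observed heterozygosity, H-W expected heterozygosity).
    """
    p_joint = sum(w * p for p, w in zip(freqs, proportions))
    # Each population contributes heterozygotes at its own H-W proportion 2p(1-p).
    observed = sum(w * 2 * p * (1 - p) for p, w in zip(freqs, proportions))
    # Expectation calculated from the joint allele frequency of the mixture.
    expected = 2 * p_joint * (1 - p_joint)
    return p_joint, observed, expected


# The extreme case from the text: frequencies 1.0 and 0.0, mixed half and half.
p, h_obs, h_exp = wahlund_mixture([1.0, 0.0], [0.5, 0.5])
print(p, h_obs, h_exp)

# A milder case with invented frequencies: the deficit is smaller but still present.
print(wahlund_mixture([0.8, 0.4], [0.5, 0.5]))
```

With equal proportions of two populations the deficit (expected minus observed) equals twice the variance of
the allele frequency among the contributing populations, which is why it vanishes when the frequencies are equal.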
Observed heterozygosity is simply the proportion of heterozygotes in a sample.
Expected heterozygosity (H) at a locus, however, is calculated from the observed allele frequencies:
$H = 1 - \sum_i x_i^2$
where xi is the frequency of the ith allele. Mean expected H is written with a ‘bar’ above it and is the arithmetic
average H at all the investigated loci (usually both monomorphic and polymorphic loci are included). Relevant
software: Hetzyg.exe (J. Mork)
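A minimal Python sketch of the heterozygosity calculation follows. It is not the Hetzyg.exe program; the function
names and the example allele frequencies are illustrative assumptions.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: one minus the sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies at a locus must sum to 1")
    return 1.0 - sum(x * x for x in allele_freqs)


def mean_expected_heterozygosity(loci):
    """Arithmetic mean of H over all investigated loci (monomorphic loci included)."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Invented example: one two-allele locus at 0.5/0.5, one monomorphic locus, one locus at 0.9/0.1.
loci = [[0.5, 0.5], [1.0], [0.9, 0.1]]
for freqs in loci:
    print(freqs, round(expected_heterozygosity(freqs), 4))
print("mean H:", round(mean_expected_heterozygosity(loci), 4))
```

A monomorphic locus contributes H = 0, so including such loci lowers the mean.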
The rationale behind Fst is that at the ‘start’ of differentiation (i.e., when reproductive isolation occurs), all the
(sub)populations have the same allele frequencies and genotype frequencies at all loci. Assume that the allele
frequency at a 2-allele polymorphic locus is 0.5. Over generations, the allele frequencies and thereby the genotype
frequencies will diverge between populations. The amount of divergence due to genetic drift will depend on
population sizes and the number of generations. If, at one point in time, all the genotypes in all the populations
are pooled in one large table, the proportion of heterozygotes will be lower than that
calculated (the H-W expectation) from the ‘joint’ allele frequencies of that mixed group. The deficit will
increase with time (generations), until eventually all the (sub)populations are fixed for one or the other allele, and
no heterozygotes are observed at all. (The expected proportion, which is based on joint allele frequencies, is
however constant and hence the same as in the undivided start population).
One way to look at this process is that the genetic variability, which at the start was entirely located within
populations, is more and more transformed to be between populations. Fst is actually a measure of the fraction of
the total genetic variation which can be attributed to differences between populations (the so-called ‘between’
component).
The formula for Fst is:
Fst = 1 - Hmean / Htotal
where Hmean is the arithmetic mean of the heterozygosities in all the subpopulations, while Htotal is the expected
heterozygosity based on the joint allele frequency in pooled subpopulations. It is evident from the formula that
Fst equals 0 when the subpopulations are identical in allele frequencies, and 1 when they are fixed for different
alleles.
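A short Python sketch of the formula follows. It assumes equally weighted subpopulations, each in Hardy-Weinberg
equilibrium so that its heterozygosity can be taken as one minus the sum of squared allele frequencies. The
function names and example frequencies are illustrative.

```python
def heterozygosity(allele_freqs):
    """Expected heterozygosity: one minus the sum of squared allele frequencies."""
    return 1.0 - sum(x * x for x in allele_freqs)


def fst(subpop_freqs):
    """Fst = 1 - Hmean / Htotal for one locus.

    subpop_freqs : list of allele-frequency lists, one per subpopulation,
                   all listing the same alleles in the same order.
    """
    n = len(subpop_freqs)
    h_mean = sum(heterozygosity(f) for f in subpop_freqs) / n
    # Joint allele frequencies in the pooled material (equal weights assumed).
    n_alleles = len(subpop_freqs[0])
    joint = [sum(f[i] for f in subpop_freqs) / n for i in range(n_alleles)]
    h_total = heterozygosity(joint)
    if h_total == 0:
        raise ValueError("locus is monomorphic in the total material; Fst is undefined")
    return 1.0 - h_mean / h_total


print(fst([[0.5, 0.5], [0.5, 0.5]]))            # identical subpopulations
print(fst([[1.0, 0.0], [0.0, 1.0]]))            # fixed for different alleles
print(round(fst([[0.8, 0.2], [0.4, 0.6]]), 4))  # intermediate, invented frequencies
```

The first two calls reproduce the limiting values named in the text (0 and 1).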
Wright’s Fst is basically a measure for single loci. Masatoshi Nei has suggested an analogous statistic, Gst,
which utilizes information from several loci simultaneously. It is calculated from allele frequencies rather than
genotype frequencies (assuming Hardy-Weinberg equilibrium in all subpopulations).
Relevant software: GSA.exe (J. Mork)
NEI’S I AND D (ABSOLUTE MEASURES OF DIFFERENTIATION)
Masatoshi Nei has also suggested another measure, called D (genetic distance), which provides an estimate of the
absolute genetic differences between populations («.. mean number of amino acid substitutions per locus»). This
statistic utilizes allele frequencies at multiple loci, and is calculated for each locus via the statistic I («genetic
Identity»). The formula is:
$I = \sum_i x_i y_i / \sqrt{(\sum_i x_i^2)(\sum_i y_i^2)}$
where x_i and y_i are the frequencies of the i-th allele in populations X and Y, respectively (in the numerical
calculations, SQR means square root).
Furthermore,
D = - ln(I).
It is common to calculate the arithmetic mean D when dealing with more than one locus.
Relevant software: DG25.exe, DG50.exe, DG100.exe (J. Mork), BIOSYS (D. Swofford)
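The two formulae can be sketched in a few lines of Python. This is not the DG or BIOSYS software. The function
names are illustrative, and the multi-locus function averages I over loci before taking the logarithm, which is
the order used in the worked example later in this chapter; averaging per-locus D values instead gives a slightly
different number.

```python
import math


def nei_identity(x, y):
    """Genetic identity I at one locus from allele frequencies x and y (same allele order)."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator


def nei_distance(loci_x, loci_y):
    """Mean I over loci, then D = -ln(mean I). Returns (mean I, D)."""
    identities = [nei_identity(x, y) for x, y in zip(loci_x, loci_y)]
    mean_i = sum(identities) / len(identities)
    return mean_i, -math.log(mean_i)


# Invented two-allele frequencies at three loci in two populations.
pop_x = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
pop_y = [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]
mean_i, d = nei_distance(pop_x, pop_y)
print(round(mean_i, 4), round(d, 4))
```

Identical allele frequencies give I = 1 and D = 0; D grows without bound as the populations share fewer alleles.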
CLUSTER ANALYSIS AND DENDROGRAM CONSTRUCTION
In studies of intraspecific genetic structure it is advisable to have information on allele frequencies at many
polymorphic loci. An efficient way of illustrating the calculated similarities and differences between groups is to
perform cluster analysis. The method outlined below is the UPGMA (Unweighted Pair Group Method with
Arithmetic Mean), which is one way to present complex matrix data graphically. There are many others.
First, the mean I or D between all pairwise combinations are calculated as explained above. The result can, e.g.,
be arranged in a matrix with the OTUs (Operational Taxonomic Units) along both axes.
The two OTUs with the smallest D (or largest I) between them are then fused into one OTU, and the I or D value
(the mean of the values of the two original OTUs) for this new OTU towards all the others is recalculated in a
new matrix. Then again the two OTUs with the smallest distance are fused, followed by a new re-calculation.
This procedure is repeated in a cyclical way until all the OTUs are parts of the same cluster.
The dendrogram which can be constructed from this gives a graphical presentation of the similarities between
OTUs in the total material.
Example (UPGMA cluster analysis of Nei’s genetic distances, and dendrogram construction):
Consider samples from 3 populations (OTUs). In each sample, genotypes at 3 loci (called HbI*, LDH-3* and
IDHP-1*) with 2 alleles at each are scored by electrophoresis, giving the following values (for sake of simplicity,
the alleles are called S and F, and the genotypes thus SS, SF, and FF at all loci. qF and qS are the calculated
allele frequencies of F and S):

Population 1:
Locus       SS    SF    FF     N     qF     qS
HbI*        25    50    25    100    0.5    0.5
LDH-3*      81    18     1    100    0.1    0.9
IDHP-1*     36    48    16    100    0.4    0.6

Population 2:
Locus       SS    SF    FF     N     qF     qS
HbI*        81    18     1    100    0.1    0.9
LDH-3*      25    50    25    100    0.5    0.5
IDHP-1*     25    50    25    100    0.5    0.5

Population 3:
Locus       SS    SF    FF     N     qF     qS
HbI*        64    32     4    100    0.2    0.8
LDH-3*      36    48    16    100    0.4    0.6
IDHP-1*     49    42     9    100    0.3    0.7
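Allele frequencies follow from the genotype counts by gene counting: every homozygote carries two copies of its
allele and every heterozygote one copy of each. A minimal Python sketch, with an illustrative function name:

```python
def allele_freqs_from_genotypes(n_ss, n_sf, n_ff):
    """Return (qS, qF) from the numbers of SS, SF and FF individuals at one locus."""
    n = n_ss + n_sf + n_ff
    if n == 0:
        raise ValueError("no individuals scored")
    q_s = (2 * n_ss + n_sf) / (2 * n)
    q_f = (2 * n_ff + n_sf) / (2 * n)
    return q_s, q_f


# LDH-3* in population 1: 81 SS, 18 SF and 1 FF individuals.
print(allele_freqs_from_genotypes(81, 18, 1))
```

The two frequencies always sum to 1, which is a quick check on the scoring sheet.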
Calculation of genetic distances:
Formulae:
$I = \sum_i x_i y_i / \sqrt{(\sum_i x_i^2)(\sum_i y_i^2)}$,
and
$D = -\ln(I)$
Calculation of I-values and D-values from the allele frequencies given by the genotype counts (in every sum the
first term refers to allele S and the second to allele F):

Population 1 versus population 2:
HbI*:      I = (0.5*0.9 + 0.5*0.1) / SQR[(0.25+0.25)*(0.81+0.01)] = 0.7809
LDH-3*:    I = (0.9*0.5 + 0.1*0.5) / SQR[(0.81+0.01)*(0.25+0.25)] = 0.7809
IDHP-1*:   I = (0.6*0.5 + 0.4*0.5) / SQR[(0.36+0.16)*(0.25+0.25)] = 0.9806
Mean I = (0.7809+0.7809+0.9806) / 3 = 0.8475
D = -ln(0.8475) = 0.1655

Population 1 versus population 3:
HbI*:      I = 0.8575
LDH-3*:    I = (0.9*0.6 + 0.1*0.4) / SQR[(0.81+0.01)*(0.36+0.16)] = 0.8882
IDHP-1*:   I = 0.9833
Mean I = 0.9097
D = -ln(0.9097) = 0.0946

Population 2 versus population 3:
HbI*:      I = 0.9910
LDH-3*:    I = 0.9806
IDHP-1*:   I = 0.9285
Mean I = 0.9667
D = -ln(0.9667) = 0.0339

Presenting the results of calculations in the first cycle in matrix form:

Matrix 1. Standard Genetic Distances (Nei 1972)
                Population 1    Population 2    Population 3
Population 1         -             0.1655          0.0946
Population 2                         -             0.0339
Population 3                                         -

The smallest value of pairwise genetic distance in this matrix is between populations 2 and 3. Therefore, these
two populations are combined into one (and will be connected by the lowest level bifurcation in the dendrogram).
The genetic distance between this ‘combined’ population and population 1 is then calculated as the arithmetic
mean of the two distances that population 2 and population 3 originally had towards population 1, i.e. mean
D = (0.1655+0.0946)/2 = 0.130, and a new matrix can be filled in:

Matrix 2. Standard Genetic Distances (Nei 1972)
                      Population 1    Population (2+3)
Population 1               -              0.130
Population (2+3)                            -

This procedure of joining the nodes with the smallest D-value in each cycle and then recalculating the matrix
proceeds until all populations have been joined. In the current example with three populations there will be two
nodes in the dendrogram, which can be drawn on the basis of the values in matrices 1 and 2:

[Dendrogram: populations 2 and 3 join at D = 0.0339; population 1 joins the combined population (2+3) at D = 0.130]
Among relevant software for cluster analysis and dendrogram construction are e.g.:
DG25.exe, DG50.exe, DG100.exe (J. Mork), BIOSYS (D. Swofford), GNKDST (M. Nei)
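The clustering cycle itself is short enough to sketch in Python. The code below is not one of the programs listed
above. It is an illustrative implementation in which the function name, the OTU labels A-D and their distances
are invented, and in which the distance from a merged cluster to the others is the average weighted by cluster
size. For the first fusion of two single OTUs this is the plain arithmetic mean described in the text.

```python
def upgma(labels, distances):
    """Cluster OTUs by repeatedly fusing the closest pair.

    labels    : list of OTU names
    distances : dict mapping (name_a, name_b) to the distance between them, all pairs given once
    Returns a list of fusions (cluster_a, cluster_b, distance) in the order they occur.
    """
    sizes = {name: 1 for name in labels}                  # number of OTUs in each cluster
    d = {frozenset(pair): value for pair, value in distances.items()}
    fusions = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)                          # smallest distance in the matrix
        a, b = sorted(pair)
        fusions.append((a, b, d[pair]))
        merged = f"({a}+{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:                               # recalculate the matrix row
            d_a = d.pop(frozenset((a, other)))
            d_b = d.pop(frozenset((b, other)))
            d[frozenset((merged, other))] = (size_a * d_a + size_b * d_b) / (size_a + size_b)
        del d[pair]
        sizes[merged] = size_a + size_b
    return fusions


example = {
    ("A", "B"): 0.02, ("A", "C"): 0.10, ("A", "D"): 0.30,
    ("B", "C"): 0.12, ("B", "D"): 0.32, ("C", "D"): 0.28,
}
for a, b, dist in upgma(["A", "B", "C", "D"], example):
    print(f"{a} joins {b} at D = {dist:.3f}")
```

Each fusion corresponds to one node of the dendrogram, drawn at the height of the reported distance.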
oooooooooooooOOOOOOOOOooooooooooooo
Lab. experiment:
STARCH GEL ELECTROPHORESIS OF FISH TISSUE
ENZYMES
(J. Stien, J. K. B. Forthun & M-A. Østensen, TBS)
(Figures 1 and 2 should not be reproduced without permission from the authors)
Preparing biological material
Tissues to be used in electrophoresis are preferably cut out and extracted as soon as possible after death
of the animal, although storage in a frozen condition may give satisfactory results as well if the
temperature is sufficiently low.
Storage
Use e.g. sealed plastic bags (avoid air pockets), adequately marked with species name, locality, date,
total number of individuals, individual numbers, tissue types, etc.
Tissues and enzymes may vary dramatically in how well they tolerate storage. Fatty tissues such as liver (and
their enzymes) are generally less suited for prolonged storage. The following storage times apply to liver
tissues of codfishes (approximate values):
Room temperature               a few hours
Refrigerator (0 to 4 °C)       < a week
Freezer (ca. -18 °C)           a few months
Biofreezer (-65 to -85 °C)     many years
Extraction
When extracts of enzymes are to be stored refrigerated for more than one day, bacterial and mucoid
growth inhibitors may preferably be added to the extraction liquid. If storage is not necessary, extraction
can be made in distilled water. Equal amounts (volume) of tissue and solution are suitable for most
practical work on fish.
Homogenization
Repeated freezing and thawing breaks the cell walls, and releases cytosol to the extraction liquid.
Manual shaking increases mixing efficiency.
Cell walls may also be broken by ultrasound treatment. Such treatment generates heat which can damage
the enzymes, and should be performed with efficient cooling.
Efficient homogenization is obtained by manually crushing the cell walls, e.g. with a glass rod in an
Eppendorf tube. Best effect is achieved with partly frozen tissue (0 °C). If many samples are to be
treated, cooling in an ice bath may be necessary.
Centrifugation
At TBS the Eppendorf tubes are centrifuged (here in a Sorvall Instruments RC5C, Rotor code 12, 10 000
rpm (~10 000 g) for 10 minutes at 2-4 °C). An ordinary table top centrifuge with > 2000 rpm and
somewhat longer runs may serve well if the temperature is kept low.
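Rotor speed (rpm) and relative centrifugal force (g) are linked through the rotor radius by the standard relation
RCF = 1.118 x 10^-5 x r x rpm^2, with r in cm. The Python sketch below applies it. The radius of 9 cm is a
placeholder chosen for illustration, not the specification of the rotor named above, and the function names are
assumptions of this example.

```python
import math

RCF_CONSTANT = 1.118e-5  # converts cm * rpm^2 to multiples of g


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Speed needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


radius_cm = 9.0  # hypothetical rotor radius; check the rotor documentation for the real value
print(round(rcf_from_rpm(10_000, radius_cm)), "x g at 10 000 rpm")
print(round(rcf_from_rpm(2_000, radius_cm)), "x g at 2 000 rpm")
```

Because the force scales with the square of the speed, a table top centrifuge at a few thousand rpm delivers
only a small fraction of the force, which is why longer runs are needed.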
Preparation of medium / the starch gel
Use hydrolyzed potato starch of analytical quality. The concentration of starch in the buffer solution
depends on the batch from the producer. At TBS starch from Sigma Chemicals is used. This gives
adequate gels at 10-11% (w/v) concentration.
Boil with constant agitation over a gas flame in e.g. an Erlenmeyer flask (double the volume of the gel).
Too little agitation may result in burned starch at the bottom of the flask. The air in the solution
is removed by vacuum suction before the gel is poured. Due to the danger of implosion, only completely
intact flasks must be used for this! When only big gas bubbles are formed by suction, the gel is ready
for pouring.
Pour the solution directly onto the thermostated plate (pre-heated to approx. 50 °C for this purpose).
Bubbles introduced during transfer should be removed, e.g. with a pipette (they can cause trouble when the
power is applied). Set the temperature of the thermostated plate to approx. 4 °C to let the gel solidify. If
not used immediately after cooling, the gel should be covered with a plastic wrapping to avoid
evaporation and drying. When covered and cool, the gel may be stored overnight. Before use, any
condensation on the gel surface should be wiped off.
Application on the gel
One or more slots are cut in the gel, depending on the width of the gel and on what type of buffer system
is used. In continuous systems (e.g. Clayton & Tretiak 1972) two or even three slots can be used, while
discontinuous systems (e.g. Ridgway et al. 1970) allow only one slot. Allow 2 cm of the gel on the anodal
and cathodal side for electrode contact, and allow sufficient space for fast migrating enzymes in the
anodal part.
Absorb the protein extracts in pieces of filter paper and load them side by side into the slot cut in the gel.
In routine analyses, the paper pieces may be as narrow as 1 mm. Allow sufficient space between pieces
to avoid contact between them. Subsequent scoring of the individual isozyme genotypes is eased by
applying e.g. groups of ten individuals, separated by a paper piece with a marker dye.
Figure 1. The figure shows how to prepare and set up for SGE. The gel is resting on a cooling plate
(2-3 °C) and connected to a power supply using two electrodes. The electrodes are connected to the
gel via two buffer vessels. (+) = anode, (-) = cathode. The figure shows a gel with three lines of slots.
Electrophoresis
If cooling is only from the bottom of the gel, a plastic wrapping should cover the surface of the gel
during the run to avoid evaporation and drying.
The application pieces should be removed after 10-15 minutes. The applied power/voltage is reduced
during sample application. The total duration of the run depends on the type of buffer system, the individual
enzymes and the pH in the gel. Suitable run times for good separation are learned through experience.
Gel slicing
Cut the gel into slices of suitable size for staining. Many types of apparatus have been developed for this.
At TBS a simple device made of a dental string, glass plates and weights is used (Figure 2). The thickness
of the slices depends on how many enzymes are to be stained for, and on the shear strength of the gel.
Figure 2. “Gel-slicer”. Prior to specific histochemical staining, the gel is cut into 1 mm thick slices
using a device such as that indicated by the drawing.
Histochemical staining
Staining normally proceeds at the highest rate at 30-35 °C. Because of different transcription rates in
cells, different enzymes exhibit large variation in staining intensity. Substrate and staining solution are
mixed while the gel is running. Light-sensitive reactants (e.g. nitro blue tetrazolium salt (NBT) and
phenazine ethosulphate (PES)) are, however, added as late as possible to reduce nonspecific staining
caused by ambient light.
LDH:
Tris-HCl, 0.2 M, pH 9.0         50 ml
DL-Na-lactate                   700 mg
NAD                             5 mg       (0.5 ml)
NBT                             10 mg      (1.0 ml)
PES                             5 mg       (0.5 ml)

PGM:
Tris-HCl, 0.4 M, pH 9.0         12.5 ml
α-D-Glucose-1-phosphate         150 mg
G-6-PDH                         40 u
MgCl2                           20 mg      (2.0 ml)
NADP                            5 mg       (0.5 ml)
NBT                             10 mg      (1.0 ml)
PES                             5 mg       (0.5 ml)
-----------------------------------------------------------
Agar                            375 mg
Distilled water                 12.5 ml

PGI:
Tris-HCl, 0.4 M, pH 9.0         12.5 ml
α-D-Fructose-6-phosphate        30 mg
G-6-PDH                         30 u
MgCl2                           20 mg      (2.0 ml)
NADP                            5 mg       (0.5 ml)
NBT                             10 mg      (1.0 ml)
PES                             5 mg       (0.5 ml)
-----------------------------------------------------------
Agar                            375 mg
Distilled water                 12.5 ml
Agar/agarose overlay
To reduce diffusion of intermediate products in the gel during staining, the staining reagents can be
mixed in a liquid agar solution which solidifies after pouring onto the gel. The most useful agaroses are
those with a solidifying temperature slightly above the incubation temperature (40 °C in a heating
cabinet or 30 °C at room temperature). Due to the smaller volume, a higher concentration of the staining
reagents can then be used with no increase in cost.
Gel pattern recording
Read the gel slices while the bands are clear and distinct, and before fixation of the bands. To read weak
bands, some degree of overstaining is often necessary. Make sure that all sufficiently clear bands are read
before overstaining the gel.
Data for individual genotypes are written on suitable forms. On these forms other data of the individuals
are also included, like species, length, weight, age, sex, gonad maturation stage, etc. Of course, sample
data like locality, date, depth etc should follow the sample.
Fixation and preservation of gels
The gel structure hardens and the stained bands are fixed in a solution containing ethanol, water and
acetic acid (5:4:1). DO NOT TOUCH THE GEL WITHOUT GLOVES AFTER STAINING! It will
always contain traces of the staining solution.
Storing
Freezing. Wrap the gel into clear plastic, to prevent air contact. Add a note with a number or short
description to each slice to connect the slices to the forms for station- and individual data. Store the
slices in an ordinary freezer (-18 °C).
Drying. After fixation the slices can be placed on filter paper sheets and put onto a gel dryer. Be careful
not to destroy the slices with too high temperature. Stop the drying process before the paper wrinkles.
Write the ID of the slice on the paper.
Photography. Use ordinary equipment for repro photography. Uneven heat removal in the gel might have
influenced the migrating rate in different sections. Light through the gel can therefore give a more
diffuse impression than light from above. Remember to photograph the ID together with the slice.
Cleaning of equipment
Some of the chemicals used are mutagenic/poisonous. RINSE ALL GLASSWARE FROM THE STAINING
PROCEDURE WITH COLD WATER BEFORE WASHING! This is to avoid inhaling toxic
vapours.
OoooooooooooooOOOOOOOOOOOOOoooooooooooooo
Lab. experiment:
USE OF ISOZYMES TO STUDY HYBRID ZONES
a diploid-tetraploid hybrid zone in orchids
Sigurd Såstad
Bot. Avd., Institutt for Naturhistorie, Vitenskapsmuseet, NTNU
Interest in hybrid zones has increased in recent years because crosses between genetically
divergent individuals are assumed to have been of decisive importance for evolution in a number of plant
and animal groups (Arnold 1997). Hybridization and introgression can lead to increased genetic diversity
within species, transfer of genetic adaptations between species, build-up or breakdown of reproductive
barriers between closely related groups, and the formation of new ecotypes or species (Barton & Hewitt
1989). Hybrid zones consisting of cytotypes with different ploidy levels are of special interest because they
make it possible to study which mechanisms are involved in the early stages of polyploid
speciation, and how reproductive isolation mechanisms affect the establishment of polyploids in
diploid populations (Thompson & Lumaret 1992, Petit et al. 1999).
Because newly formed polyploids are a small minority in diploid populations, they will almost always
disappear unless they have properties that enable them to establish in the population or to
colonize new areas (‘minority cytotype exclusion principle’; Levin 1975). The cytotype that is in the
minority will more often be pollinated by the dominant cytotype and thus produce offspring that cannot
germinate (‘triploid block’) or mostly sterile triploid offspring (Petit et al. 1999). In models, the
establishment success of a neopolyploid depends on factors such as 2n-gamete production in diploids, the
relative fitness of polyploids versus diploids, and the degree of fertility of triploid individuals (Felber 1991,
Felber & Bever 1997). If higher competitive ability in the polyploid is combined with higher
fecundity, self-fertilization and/or habitat segregation between the cytotypes, the probability of
polyploid establishment increases (Rodriguez 1996).
Study species: Dactylorhiza incarnata ssp. cruenta x Dactylorhiza lapponica
The genus Dactylorhiza includes several taxa that are endemic to north-western Europe. Many members
of the genus have highly variable morphology, and since some taxa often have sympatric distributions,
hybrids are relatively common (Hedrén 1996, Malmgren 1992). Dactylorhiza is consequently a
taxonomically poorly defined / delimited group internally. Some populations are assumed to consist only of
hybridogenous individuals, while others are considered to originate from hybridogenous ancestral forms.
Both morphological and cytological data indicate that evolution in Dactylorhiza is to a large extent
reticulate (Hedrén 1996).
A hybrid zone between the diploid Dactylorhiza incarnata ssp. cruenta (blodmarihand, 2n = 40) and
the tetraploid D. lapponica (lappmarihand, 2n = 80) is the starting point for this project. Confirmed
hybridization between these species is known only from Røros, Sør-Trøndelag (Lid & Lid 1994), in an
old cultural landscape area (Sølendet nature reserve; Moen 1990). In this area, however, the
hybrid is very common and partly dominant in areas that have traditionally been influenced by haymaking.
These areas are managed today to prevent overgrowth after the traditional outfield haymaking
ceased. Dactylorhiza incarnata ssp. cruenta is found primarily on soft carpets (mykmatter) in rich fen
areas, while D. lapponica has its primary habitats at calcareous springs and in extremely rich firm lawns
(fastmatter) on mire. It is assumed that clearing and mowing, by preventing overgrowth, have opened many
new potential habitats for both species, which has also given the hybrid between them opportunities to
establish.
D. lapponica is one of many allotetraploid species that result from a cross between early
variants of Dactylorhiza incarnata and Dactylorhiza fuchsii (skogmarihand, 2n=40; Hedrén 1996).
Secondary hybrid zones between allopolyploids and their diploid parental species have, as far as we know,
not been reported previously (cf. Petit et al. 1999).
Aim of the exercise:
Perform isoelectric focusing of the enzymes PGI (dimeric enzyme) and PGM (monomeric enzyme).
The material is a tetraploid orchid (Dactylorhiza lapponica) and its potential parental species
(D. incarnata ssp. cruenta and D. fuchsii), as well as the hybrid between D. lapponica and D. incarnata ssp.
cruenta. From the results we shall:
Evaluate the electrophoretic patterns in the tetraploid, and try to determine whether it is an allo- or autopolyploid.
An allotetraploid will show disomic inheritance with fixed heterozygosity (i.e. homoeologous
chromosomes inherited from divergent lineages will rarely or never pair in meiosis; Weeden & Wendel
1989; Figure 1).
An autotetraploid will have tetrasomic inheritance, with distinct segregation in expected ratios (homo- and
heterozygotes).
Evaluate whether the diploids are likely parental species of the tetraploid.
Evaluate the electrophoretic patterns in the hybrids, and determine whether they agree with what
we would expect in a first-generation triploid hybrid between the diploid and the tetraploid, or whether
the hybrid appears to represent a backcross to the parental species.
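The contrast between disomic and tetrasomic inheritance can be made concrete by enumerating the gametes a
tetraploid can form. The Python sketch below is illustrative and rests on simplifying assumptions: one locus with
two alleles written A and a (hypothetical labels), random bivalent pairing without double reduction in the
autotetraploid, and pairing strictly within each parental genome in the allotetraploid.

```python
from collections import Counter
from itertools import combinations, product


def tetrasomic_gametes(genotype):
    """Autotetraploid: any two of the four homologous chromosomes can end up in a gamete."""
    return Counter("".join(sorted(pair)) for pair in combinations(genotype, 2))


def disomic_gametes(genome_1, genome_2):
    """Allotetraploid: each gamete receives one chromosome from each parental genome."""
    return Counter("".join(sorted(pair)) for pair in product(genome_1, genome_2))


# Autotetraploid with two copies of each allele: gametes segregate AA : Aa : aa.
print(tetrasomic_gametes("AAaa"))

# Allotetraploid carrying A on one parental genome and a on the other:
# every gamete is Aa, so every offspring shows the same heterozygous pattern.
print(disomic_gametes("AA", "aa"))
```

Under these assumptions the autotetraploid gives three gamete classes in the ratio 1:4:1, so its offspring
segregate into homo- and heterozygotes, whereas the allotetraploid gives a single gamete class, which is what
fixed heterozygosity looks like on a gel.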
Fig. 1. Expected inheritance patterns in allo- and autotetraploids.
Extraction and electrophoresis
Extraction: for isozyme analysis of plant material, extraction is a critical step. Plants
generally contain a good many chemical compounds that can degrade the enzymes when they come into
contact with them during homogenization of the material.
Overview of additives in the homogenization buffer and their presumed mode of action:

Harmful tissue substances    Reaction with protein                                 Reaction conditions
Phenols                      H-bonds to O atoms in the peptide bonds of the        acidic/neutral
                             proteins
Quinones                     React with NH2 and SH groups (seen as browning
                             of the tissue). Formed from phenols by phenol
                             oxidases under basic conditions
Phenol oxidases                                                                    basic

Additive                                   Mode of action
PVP, caffeine                              H-bonds to phenols form insoluble compounds under
                                           acidic/neutral conditions. PVP can inhibit glutamine synthetase
Sodium ascorbate, sodium metabisulfite     Reducing agents (reduce quinones?). Can inhibit some
(mercaptoethanol?)                         dehydrogenase systems
Sodium borate (mercaptoethanol?)           Inhibits o-diphenol oxidase
DIECA                                      Inhibits phenol oxidases by acting on the copper-containing
                                           active centre. Can inhibit SOD
DMSO                                       Stabilization of the extract
Extraction is done by crushing the material with a pestle on ice. The extract is absorbed onto filter paper and
frozen at -80 °C (can be stored for several months).
Electrophoresis: Test runs show that the orchid enzymes in question have a very
low isoelectric point (pI). With IEF in a pH 4-9 gradient, the enzymes disappear into the anode
at full focusing. The run must therefore be stopped after 5-20 min, and the gel stained immediately.
Staining mechanism (PGI): (substances marked with * are added to the staining solution)
Fructose-6-phosphate* --[PGI]--> Glucose-6-phosphate
Glucose-6-phosphate --[G6PD*, Mg++*, NADP*]--> 6PGA + NADPH
NADPH / PES* / NBT* --> coloured formazan
Staining is done with the agar overlay technique. The gel is placed at 37 °C in the dark for 10-30 min
before fixation.
Literature:
Arnold ML. 1997. Natural hybridization and evolution (Oxford Series in Ecology and Evolution).
New York: Oxford University Press.
Barton NH, Hewitt GM. 1989. Adaptation, speciation and hybrid zones. Nature 341: 497-503.
Bretagnolle F, Thompson JD. 1995. Gametes with somatic chromosome number: mechanisms of their
formation and role in the evolution of autopolyploid plants. New Phytologist 129: 1-22.
Felber F. 1991. Establishment of a tetraploid cytotype in a diploid population: effect of relative fitness
of the cytotypes. Journal of Evolutionary Biology 4: 195-207.
Felber F, Bever JD. 1997. Effect of triploid fitness on the coexistence of diploids and tetraploids.
Biological Journal of the Linnean Society 66: 95-106.
Hedrén M. 1996. Genetic differentiation, polyploidization and hybridization in northern European
Dactylorhiza (Orchidaceae): evidence from allozyme markers. Plant Systematics and Evolution 201:
31-55.
Levin DA. 1975. Minority cytotype exclusion in local plant populations. Taxon 24: 35-43.
Lid J, Lid DT. 1994. Norsk flora. Oslo: Det norske Samlaget.
Malmgren S. 1992. Hybridisering bland svenska orkideer - korsnings - och odlingsforsök. Svensk
Botanisk Tidskrift 86: 337 - 346.
Moen A. 1990. The plant cover of the boreal uplands of central Norway. I. Vegetation ecology of
Sølendet nature reserve; haymaking fens and birch woodlands. Gunneria 63:
Petit C, Bretagnolle F, Felber F. 1999. Evolutionary consequences of diploid - polyploid hybrid zones
in wild species. Trends in Ecology and Evolution 14: 306-311.
Thompson JD, Lumaret R. 1992. The evolutionary dynamics of polyploid plants: origins,
establishment and persistence. Trends in Ecology and Evolution 7: 302-307.
Weeden, N.F. & Wendel, J.F. 1989 Visualization and interpretation of plant isozymes. In Soltis DE
& Soltis PS (eds.). Isozymes in plant biology. Dioscorides, Portland. pp 46-63.
oooooooooo00000000000000000oooooooooo
RFLP MARKERS
The cDNA RFLP SypI
Sten Karlsson
TBS, Inst. for Naturhistorie, Vitenskapsmuseet, NTNU
Synaptophysin (SypI) is a population genetic marker for cod, belonging to the class of markers called
RFLP (Restriction Fragment Length Polymorphism). The locus codes for an integral membrane
protein of synaptic vesicles. Primers have been constructed for this gene. The forward primer (B) is
situated in the third exon of the gene and the reverse primer 52 bp beyond the termination codon. In
short, the polymorphism is due to the presence or absence of a restriction site. The uncut PCR product is
1051 bp in length. When this fragment is exposed to the six-base-pair restriction enzyme Dra I, both gene
copies in every genotype are cut into one 773 bp and one 278 bp fragment, giving two 773 bp fragments and
two 278 bp fragments per individual. If there are no other fragments, the individual is homozygous AA. If
there is an additional Dra I restriction site, the 773 bp fragments are cut further into two 495 bp fragments
and two 278 bp fragments. In this case the individual is homozygous BB. If the additional restriction site is
present in only one of the homologous gene copies, the individual is heterozygous AB. This individual will
produce one 773 bp fragment, one 495 bp fragment and three 278 bp fragments. All these fragments are
separated and visualized on an agarose gel (Figure below).
[Figure: agarose gel with the 773 bp, 495 bp and 278 bp bands marked.]
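The scoring rule described above can be written as a small lookup. This is only a sketch: it assumes that each band in a lane has already been assigned to one of the three nominal sizes (773, 495 or 278 bp), and the function name sypi_genotype is made up for illustration.

```python
def sypi_genotype(band_sizes_bp):
    """Return 'AA', 'AB' or 'BB' from the nominal band sizes seen in one lane."""
    bands = frozenset(band_sizes_bp)
    # Band patterns taken from the text: the 278 bp band is present in all genotypes.
    patterns = {
        frozenset({773, 278}): "AA",        # second Dra I site absent in both copies
        frozenset({773, 495, 278}): "AB",   # second site present in one copy only
        frozenset({495, 278}): "BB",        # second site present in both copies
    }
    if bands not in patterns:
        raise ValueError(f"unexpected band pattern: {sorted(bands)}")
    return patterns[bands]


for lane in ([773, 278], [773, 495, 278], [495, 278]):
    print(lane, "->", sypi_genotype(lane))
```

A lane that shows any other combination (for example the uncut 1051 bp product) raises an error rather than being scored.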
The procedure for genotyping
The procedure for genotyping can be divided into four steps. The first step is isolation of DNA,
the second step is PCR amplification of the gene, the third step is cutting of the PCR product with the restriction
enzyme DraI, and the final step is electrophoretic separation on an agarose gel.
DNA isolation
In this lab course, DNA will be isolated from liver.
* A small piece of liver (approx. 70 mg) is crushed with a glass rod in a 2 ml plastic tube.
* 700 µl of proteinase K buffer and 4 µl of proteinase K are added.
* The tubes are placed in a heating cabinet set to 50 °C and incubated overnight.
* To each tube, add 400 µl of Tris-saturated phenol and 600 µl of isoamyl alcohol-chloroform (1:24).
* Rotate the tubes for 30 minutes, then centrifuge at 5000 g for 15 minutes.
* A maximum of 500 µl of the upper aqueous phase is carefully removed with a pipette and transferred to
new sterilized tubes.
* The DNA is precipitated in 2 times the volume of ice-cold 96% ethanol.
* The DNA pellet is rinsed in 70% ethanol by careful rotation for 30 minutes.
* The ethanol is discarded and the DNA pellet is allowed to dry.
* The DNA pellet is resuspended in 100 µl of TE buffer or sterilized water.
Proteinase K buffer: 1 ml 1 M Tris buffer, 0.1 ml 0.5 M EDTA, 0.5 ml 10% SDS, 8.4 ml dH2O
Proteinase K stock solution: Add 5 ml of 50% glycerol/water mixture to 100 mg proteinase K
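The final concentrations in the proteinase K buffer follow from the recipe by simple dilution arithmetic (stock concentration x stock volume / total volume). The sketch below works through the numbers given above; the helper name final_concentration is only illustrative.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ml / total_volume_ml


# Volumes (ml) from the proteinase K buffer recipe in the text.
volumes_ml = {"Tris": 1.0, "EDTA": 0.1, "SDS": 0.5, "dH2O": 8.4}
total_ml = sum(volumes_ml.values())  # about 10 ml

tris_mM = final_concentration(1000.0, volumes_ml["Tris"], total_ml)   # 1 M stock
edta_mM = final_concentration(500.0, volumes_ml["EDTA"], total_ml)    # 0.5 M stock
sds_percent = final_concentration(10.0, volumes_ml["SDS"], total_ml)  # 10% stock

# Proteinase K stock: 100 mg dissolved in 5 ml.
proteinase_k_stock_mg_per_ml = 100.0 / 5.0

print(f"Tris {tris_mM:.0f} mM, EDTA {edta_mM:.0f} mM, SDS {sds_percent:.1f} %")
print(f"Proteinase K stock {proteinase_k_stock_mg_per_ml:.0f} mg/ml")
```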
PCR
Mastermix                              µl
dH2O                                   7.9
PCR reaction buffer (1X)               2
MgCl2 (3.75 mM)                        3
dNTP (0.25 mM)                         4
Primer (SYN 7) reverse (0.072 µM)      0.6
Primer (B) forward (0.0705 µM)         0.6
*Taq                                   1
* Taq is diluted 1:5
Normally, Taq is excluded from the mastermix and is instead added to each tube after everything
else has been added. Before the tubes are placed in the PCR machine, 30 µl of mineral oil is added to
prevent the liquid from evaporating.
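When many samples are run, the per-reaction volumes in the mastermix table are multiplied by the number of tubes. The sketch below does this for the components listed above, leaving Taq out because it is added to each tube separately. The 10% pipetting overage is an assumption for illustration and is not part of the course protocol.

```python
# Per-reaction volumes (µl) from the mastermix table; Taq is added per tube.
PER_REACTION_UL = {
    "dH2O": 7.9,
    "PCR reaction buffer": 2.0,
    "MgCl2": 3.0,
    "dNTP": 4.0,
    "Primer SYN 7 (reverse)": 0.6,
    "Primer B (forward)": 0.6,
}


def mastermix_volumes(n_samples, overage=0.10):
    """Total volume (µl) of each component for n_samples reactions.

    overage is a hypothetical safety margin for pipetting losses.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Volume of mastermix (without Taq) to dispense into each tube.
aliquot_per_tube_ul = round(sum(PER_REACTION_UL.values()), 2)

for name, vol in mastermix_volumes(24).items():
    print(f"{name:26s}{vol:8.2f} µl")
print("Aliquot per tube:", aliquot_per_tube_ul, "µl")
```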
The following program is run on the PCR machine:
Start: 5 min at 94 °C (denaturation)
30 cycles: denaturation (94 °C, 30 seconds), annealing (55 °C, 30 seconds), extension (72 °C, 30 seconds)
Stop: 7 min at 72 °C
Cutting with restriction enzyme (DraI)
Mastermix        µl
RE-buffer        2
dH2O             4
*DraI            2
* DraI is diluted 1:4
The mastermix is added to each tube together with 10 µl of the PCR product. Incubate for 90 minutes at
37 °C.
The digestion is stopped by adding 1 µl of 0.2 M EDTA and 4 µl of loading dye.
Electrophoretic separation
The fragments are separated and visualized on an agarose gel.
A 50 ml agarose gel is prepared by adding 1 g of agarose to 50 ml of 1X TBE buffer and 1.5 µl of
ethidium bromide.
10 µl of the product obtained from the restriction enzyme digestion is loaded into each well of the
gel, which is submerged in 1X TBE buffer. The gel is run for approximately 20 minutes at a
maximum voltage of 120 V.
The fragments are visualized on a UV light table and photographed with a Polaroid camera.
Literature
Carvalho, G. R. & T. J. Pitcher (eds.). 1995. Molecular genetics in fisheries. Chapman &
Hall. London. 141 pages.
Hillis, D. M., C. Moritz & B. K. Mable. Second edition. 1996. Molecular Systematics.
Sinauer Associates, Inc. Publisher Sunderland, Massachusetts. 655 pages.
Fevolden, S. E & G. H. Pogson. 1997. Genetic divergence of Atlantic cod at the
synaptophysin (SypI) locus among Norwegian coastal and north-east Arctic
populations of Atlantic cod. Journal of fish biology 51: 895-908.
Pogson, G. H., K. A. Mesa & R. G. Boutilier. 1995. Genetic population structure and gene flow in the
Atlantic cod Gadus morhua: a comparison of allozyme and nuclear RFLP loci. Genetics 139: 375-385.
ooooooooooooooooOOOOOOOOOOOOOOoooooooooooooooo
DNA MARKERS
(MINI- AND MICROSATELLITES, PCR)
Anthony Ryan
Max Planck Institute for Evolutionary Anthropology, Inselstraße 22, 04103 Leipzig, Germany.
The following discussion is a general outline of laboratory techniques. A more detailed description of
these is given in the “Core Reading” listed in the references section.
Tissue preservation
Several methods exist for storing tissue samples prior to DNA extraction. If possible, it is better to
freeze the tissue samples, so that both protein and nucleic acid components (DNA, RNA etc.) may be
analysed. However, this can be impractical in situations where samples must be collected in the field
or transported without freezing, for example by airmail. One alternative is to store and transport tissue
samples, such as gill or muscle from fish, in several volumes of absolute alcohol (“several volumes”
means that the volume of ethanol must be two to three times the volume of the tissue sample). Under
these conditions, the degradation of the sample by bacteria or fungi is inhibited by alcohol.
Alternatively, tissues may be transported frozen, for example in dry ice. However, the cost is often
prohibitive. For the extraction of DNA from fish samples, gill tissue stored and transported in ethanol
gives high quality DNA extracts which are sufficient for most laboratory applications.
DNA Extraction
In order to isolate nucleic acids from tissue samples, it is first necessary to disrupt the cell membranes
and remove the proteins which are present. Routine DNA extraction protocols usually begin by
digesting the tissue samples for several hours using a protein-degrading enzyme, such as Proteinase
K. After this stage, it is often desirable to degrade the RNA which is present in the samples. This is
achieved using an enzyme called RNase.
Afterwards, the degraded proteins must be separated from the nucleic acids. Here, two
common protocols are used. In the first, called Phenol/Chloroform Extraction, the samples are
extracted with phenol and chloroform. These solvents remove the protein fraction from the sample,
and DNA can be precipitated from the resulting solution. Alternatively, the proteins can be
precipitated from the solution by adding a concentrated salt solution, in a process called Salting-Out.
After salting out or phenol-chloroform extracting the samples, the DNA must be concentrated to a
considerably smaller volume in order to be useful in laboratory analyses. This is usually achieved by
precipitation in either two volumes of Ethanol or one volume of Iso-propanol, and centrifugation to
collect the resulting pellet, which is then re-suspended in sterile water or an appropriate storage buffer
such as Tris-EDTA (TE).
Both phenol-chloroform and salting out procedures give DNA extracts of sufficient quality for most
laboratory requirements. However, although the phenol-chloroform procedure is longer and requires
the use of corrosive solvents, it does yield a higher level of DNA purity.
DNA quality control
The molecular weight of the extracted DNA can be determined by electrophoresing the sample on a
0.5% agarose gel. A high molecular weight DNA ladder should be included in order to determine the
approximate molecular weight of the extracted DNA. The molecular weight should be
as high as possible, as some types of laboratory analyses require high molecular weight DNA.
The concentration of the DNA in each individual sample is best determined by measuring the
absorbance at 260nm. By measuring the absorbance at 280 nm, and calculating the ratio of the
absorbances at 260/280 nm, it is possible to determine the degree of purity of the extraction. A good
quality DNA extraction should have a ratio of Abs260/Abs280 = 1.8. Lower ratios may be indicative of
protein contamination, but in practice Abs260/Abs280 > 1.6 is usually sufficient.
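The purity check described above amounts to one division and a comparison. The sketch below uses the thresholds quoted in the text (about 1.8 for a good extraction, above 1.6 usually sufficient); the function names and the wording of the labels are illustrative only.

```python
def purity_ratio(a260, a280):
    """Ratio of absorbances at 260 nm and 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def purity_label(ratio, sufficient=1.6):
    """Rough classification using the practical threshold given in the text."""
    if ratio > sufficient:
        return "usually sufficient"
    return "possible protein contamination"


ratio = purity_ratio(0.90, 0.50)
print(f"A260/A280 = {ratio:.2f}: {purity_label(ratio)}")
```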
Minisatellites and Microsatellites
These types of molecular markers are composed of core sequences (also called mini- or microsatellite
motifs) which are repeated tandemly. The differences between different alleles at any locus are due to
the number of times the core sequence is repeated. For a microsatellite, the core sequence is 2 – 6 b.p.
(base pairs) long. For example, the human microsatellite HumF13b contains the core sequence
(TTTA)n repeated several times. The core sequence for a minisatellite, on the other hand, is usually
much longer (20 – 30 b.p.), and is often not perfectly repeated.
Initially, minisatellite loci were used in multi-locus profiles which could be used to determine
individual specific DNA fingerprints from Southern blots. However, while this technique was applied
for individual identification purposes, it is not possible to determine which fragments on a Southern
blot belong to which locus, and so little about population structure (heterozygosity, random mating,
etc) can be determined.
This problem was overcome by designing Southern blot probes which contain the minisatellite motif
plus some of the flanking sequence. Thus, a single locus minisatellite profile is obtained, and existing
statistical methods can be used to gain information about the populations under study.
The major difficulty with Southern blot techniques is that they require large amounts (>1 µg) of high
molecular weight (>20 kb) DNA. This problem is solved by using PCR based methods, for which 20
ng (0.02 µg) DNA is often sufficient. The DNA used need not be of high molecular weight for PCR
assays.
In the PCR assay, primers which are specific for the region of DNA at either side of the mini- or
microsatellite locus are used to obtain PCR products, the sizes of which can be determined by
electrophoresis.
The isolation and characterisation of new mini- and microsatellite DNA loci is time consuming.
Several methods have been described for this (see “Core Reading”).
Mitochondrial DNA
mtDNA, a maternally inherited closed circular molecule, was among the earliest DNA markers used
in the study of wild populations. Initially, purified mtDNA was subjected to digestion by restriction
enzymes, and the resulting restriction fragments were converted to restriction maps after
electrophoresis (RFLP). With the advent of PCR, it became possible to amplify mitochondrial
segments from total cellular DNA, and subject these PCR products to restriction digestion or to direct
sequencing. Several new approaches, such as mismatch distribution analysis, have allowed
researchers to gain information on population history.
Core reading
Avise, J.C. (1994) Molecular markers, Natural History and Evolution. Chapman and Hall, New York.
(Chapter 3, Molecular Tools, and Chapter 4, Interpretive tools.)
O'Connell, M. and Wright, J.M. (1997) Microsatellite DNA in fishes. Reviews in Fish Biology and
Fisheries 7: 331 – 367.
Additional references
Avise, J.C. (1994) Molecular markers, Natural History and Evolution. Chapman and Hall, New York.
(Chapter 9, Conservation Genetics).
Allendorf F.W. and Seeb, L.W. (2000) Concordance of genetic divergence among sockeye salmon
populations at allozyme, nuclear DNA and mitochondrial DNA markers. Evolution 54: 640 – 651.
Carvalho G.R. and Pitcher T.J. Eds. (1995) Molecular genetics in fisheries. Chapman and Hall,
London.
DeWoody, J.A. and Avise, J.C. (2000) Microsatellite variation in marine, freshwater and anadromous
fishes compared with other animals. Journal of Fish Biology 56: 461 – 473.
Hewitt, G. (2000) The genetic legacy of the Quaternary Ice Ages. Nature 405: 907 – 913.
Keller, L. and Ross, K.G. (1998) Selfish genes: a green beard in the red fire ant. Nature 394: 573 –
575.
Poinar, H.N. (1999) DNA from fossils: the past and the future. Acta Pædiatr Suppl 433: 133 - 140.
Schneider P.M., Seo, Y. and Rittner, C. (1999) Forensic mtDNA hair analysis excludes a dog from
having caused a traffic accident. International Journal of Legal Medicine 112: 315 – 316.
Stoneking, M. (1994) Mitochondrial DNA and Human Evolution. Journal of Bioenergetics and
Biomembranes 26: 251 - 259.
DNA Extraction Protocol
I. Tissue digestion
Proteinase K Buffer:
0.5 M EDTA (pH 8.0)          10 ml
Sodium sarcosyl [10%]        2.5 ml
1 M Trizma Base.HCl          0.5 ml
Distilled deionized H2O      up to 50 ml
1. Label a 2 ml sterile tube and add 0.8 ml Proteinase K buffer.
2. Cut approximately 0.5 cm³ of tissue (gill tissue in ethanol) and remove the ethanol by
pressing the piece of tissue between two pieces of absorbent paper. Put the dry tissue into the
labelled tube with the buffer.
3. Add 4 µl of proteinase K (20 mg/ml) and incubate at 50 °C overnight.
4. Check that the tissue is well dissolved. If not, add 2 µl of proteinase K (20 mg/ml) and incubate for 1 h
at 50 °C.
5. Add 10 µl of RNase (10 mg/ml) and incubate at 37 °C for 1-2 h.
II. Phenol-chloroform extraction
1. Add 0.4 ml of phenol and 0.4 ml of chloroform/isoamyl alcohol (24:1). Mix gently and place on a
rotating platform for 15 min to 1 hour.
2. Centrifuge the sample for 15 min at maximum speed.
3. Transfer approximately 0.6 ml of the aqueous phase into a new labelled sterilised tube, taking care not
to disturb the interphase.
4. Add 2-2.5 volumes of pure ethanol (-20 °C) and shake gently so that the DNA precipitates and
falls to the bottom as a “stringy” pellet.
5. Replace the solution with 70% ethanol and place on a rotating platform overnight to remove salts
from the DNA preparation.
6. Remove the ethanol and allow the pellet to dry, as ethanol can interfere with the subsequent
analysis (e.g. PCR).
7. Re-suspend the DNA in 50 µl of TE buffer (10 mM Tris, 0.1 mM EDTA). Gently agitate the tube to
aid re-suspension of the pellet. Allow the DNA to re-suspend at +4 °C for at least 24 hours before
assessing its quality and concentration.
Agarose minigel electrophoresis (check molecular weight of DNA)
Check the quality of the DNA by running 1 µl of re-suspended DNA solution, 1 µl of 6x Loading Dye
(stock solution 10X Loading Dye: 30% ficoll, 100 mM EDTA, 0.4% Bromophenol Blue, 0.4%
Xylene Cyanol) and 4 µl sterile water on a 0.5% agarose gel in 0.5x TBE buffer containing
5 µl of ethidium bromide (10 mg/ml) per 100 ml.
TBE buffer (5x)
Trizma Base             54 g
Boric Acid              27.5 g
0.5 M EDTA (pH 8.0)     20 ml
Apply a constant voltage of 40V for 20 minutes. The DNA should appear as a concentrated band
close to the origin, indicating that only high molecular weight DNA is present.
Spectrophotometric determination of DNA concentration
The DNA concentration is calculated from its optical density (O.D.) at 260 nm. Dilute 10 µl of resuspended
DNA in 1 ml of distilled water. Make sure the solution is well mixed, add it to a quartz
cuvette and insert the cuvette into the spectrophotometer. An O.D. value of A260 = 1 corresponds to a
DNA concentration of 50 µg/ml. The absorbance value at 260 nm (which is the absorbance maximum
of DNA) is then used in the following formula to estimate the concentration of DNA in the original
tube, where 100 is the dilution factor and the division by 1000 converts µg/ml to µg/µl:
$$\text{Concentration DNA } (\mu g/\mu l) = \frac{A_{260} \times 50 \times 100}{1000}$$
which is equal to A260 x 5 = Concentration (µg/µl)
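The formula can be checked with a few lines of code. The sketch below uses the values from the text (50 µg/ml per O.D. unit, 100-fold dilution); the function and parameter names are illustrative.

```python
def dna_concentration_ug_per_ul(a260, dilution_factor=100.0, ug_per_ml_per_od=50.0):
    """DNA concentration in the original tube (µg/µl) from the A260 of the diluted sample."""
    ug_per_ml = a260 * ug_per_ml_per_od * dilution_factor
    return ug_per_ml / 1000.0  # convert µg/ml to µg/µl


for a260 in (0.05, 0.10, 0.20):
    print(f"A260 = {a260:.2f}  ->  {dna_concentration_ug_per_ul(a260):.2f} µg/µl")
```

With the default values the function reduces to A260 x 5, as stated above. Strictly, 10 µl added to 1 ml gives a 101-fold dilution; the text rounds this to 100.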
POLYMERASE CHAIN REACTION PROTOCOL
1. Thaw the PCR reagents (which are kept frozen at -20 °C) and the DNA template.
PCR Ingredients                                              Stock solution
Reaction buffer IV (Advanced Biotechnologies™)               10X
Magnesium Chloride (MgCl2) (Advanced Biotechnologies™)       25 mM
Deoxynucleotide Triphosphates (dNTPs) (Pharmacia)            1.25 mM
Forward Primer                                               20 µM
Reverse Primer                                               20 µM
Distilled deionized H2O                                      -
2. Prepare a solution of mastermix:
Ingredients       Volume X1 sample   Final concentration
Buffer            2 µl               1X
MgCl2             1.6 µl             2 mM
dNTP              4.0 µl             0.25 mM
Primer F.         1 µl               1 µM
Primer B.         1 µl               1 µM
Distilled H2O     9.2 µl             -
TOTAL             19 µl (including the 0.2 µl of Taq added in step 5)
3. Store the Mastermix solution at +4 °C.
4. Add 1 µl of DNA (200 µg/µl) to each tube.
5. Add 1 unit of Taq Polymerase per sample (stock solution 5 U/µl, 1 U = 0.2 µl) to the mastermix.
6. Add the correct amount of mastermix to each tube (in this case 19 µl to bring the total volume to
20 µl). Be very careful not to cross-contaminate the samples. Make sure that the DNA template and the
mastermix are properly mixed.
7. Add 20 µl of mineral oil to each tube to prevent evaporation during the reaction.
8. Insert the tubes into the PCR thermal cycler and start the program with an initial denaturation step
of 5 minutes at 95 °C followed by 30 cycles of 95 °C for 1 minute, 60 °C for 1 minute and 72 °C for 2
minutes. Finally, a single step of 72 °C for 5 minutes ensures that all fragments are fully elongated.
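The final concentrations listed in the mastermix table can be verified from the stock concentrations, the pipetted volumes and the 20 µl reaction volume. The following sketch does that check; the dictionary layout and names are illustrative, while the numbers are taken from the protocol above.

```python
REACTION_VOLUME_UL = 20.0

# (stock concentration, volume per reaction in µl, unit), from the protocol tables.
COMPONENTS = {
    "Buffer": (10.0, 2.0, "X"),
    "MgCl2": (25.0, 1.6, "mM"),
    "dNTP": (1.25, 4.0, "mM"),
    "Primer F.": (20.0, 1.0, "µM"),
    "Primer B.": (20.0, 1.0, "µM"),
}
WATER_UL = 9.2
TAQ_UL = 0.2   # 1 unit from a 5 U/µl stock
DNA_UL = 1.0


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Concentration in the finished reaction, in the unit of the stock."""
    return stock * volume_ul / total_ul


final = {name: final_concentration(stock, vol) for name, (stock, vol, _) in COMPONENTS.items()}
mastermix_ul = sum(vol for _, vol, _ in COMPONENTS.values()) + WATER_UL + TAQ_UL

for name, (_, _, unit) in COMPONENTS.items():
    print(f"{name:10s}{final[name]:6.2f} {unit}")
print(f"Mastermix per tube: {mastermix_ul:.1f} µl, plus {DNA_UL:.0f} µl DNA")
```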
Analysis of PCR Products on Agarose Gel (minisatellites)
Gel moulding
1. Make up 300 ml of 1% agarose gel. Dissolve the agarose in 1X TBE buffer by heating the solution.
When the agarose is completely dissolved, allow the solution to cool for 15-20 minutes at room
temperature to avoid distorting the casting tray.
2. Before pouring the gel solution into a 20 x 30 x 0.6 cm casting tray, add 5 µl of ethidium
bromide, mix thoroughly and then pour it into the tray.
3. Make sure that there are no bubbles and insert the comb which forms the wells. Let the gel
solidify for 1 hour. CAUTION: Ethidium bromide is mutagenic and should always be handled
with care. Wear gloves.
Electrophoresis
1. Remove the comb from the gel. Put the gel into the electrophoretic apparatus and cover it
with 2 L of 1X TBE buffer.
2. Add 2 µl of 6x Loading Dye to 10 µl of PCR product and apply the samples to the gel. Load
8 µl of molecular weight ladder to allow determination of allele sizes.
3. Apply a constant voltage of 70 V for 14 hours.
4. When the electrophoresis is complete, place the gel on a UV transilluminator and photograph
it.
Analysis of PCR Products (microsatellite) on an Automatic DNA Sequencer (Li-Cor).
For this system, it is necessary that one of the primers is labelled with a fluorescent dye, which is
detected by a laser. The DNA is denatured by heating and kept denatured by the addition of
formamide to the loading dye.
Preparation of gel plates
1. Clean the two gel plates very carefully both with distilled water and ethanol to remove all
dust particles which might cause air bubbles.
2. Insert a 0.25mm spacer on each side between the two plates and bind them with the clamps (leave
the top clamp open to insert the top spacer).
3. Position the gel plates at a shallow angle (approximately 35°) to make it easier to pour the gel.
4. Make up the polyacrylamide gel solution in a 50 ml beaker:
Ultrapure Urea                      10.5 g
distilled water                     10 ml
5X TBE Buffer                       5 ml
RapidGel™-XL-40% Concentrate        2.8 ml
Place parafilm over the mouth of the container and mix thoroughly until all the urea has
dissolved.
5. Add 25 µl TEMED and 175 µl of Ammonium Persulphate solution. Mix thoroughly. Draw
the mixture (approx. 20 ml of solution) into a pipette or a syringe.
6. Pour the solution between the two plates. Check for bubbles, and if any are present remove them
with a thin wire.
7. Apply the top spacer (to leave the space for the comb) and insert the casting plate, tighten the top
clamps and leave the gel to polymerise for at least 1 hour (not more than 2 hours because it will dry
out).
Preparation of the gel for loading
1. When the gel has solidified, remove the casting plate and the top spacer. Remove the excess
polyacrylamide with distilled water and a piece of paper.
2. Place the gel plates onto the DNA sequencer. Insert the buffer chamber and tighten the
clamps.
3. Pour 500 ml of 1X TBE buffer into each buffer tray. Clear the loading edge of excess
polyacrylamide by rinsing it with a Pasteur pipette (to see the loading edge more easily, place
a silver surface between the gel plates and the sequencer).
4. Put the lids on each buffer tray and connect the circuit from the gel plates to the automatic
sequencer. Close the interlock.
5. Create a new file for the data on the hard drive of the automatic sequencer. Open the program (data
collection). Create a new directory (file, new, create). Set the voltage to 1200V, the current to 50A
and the power to 50W. Turn on the scanner. Click enter and check that the circuit is closed.
6. Pre-run the gel for at least 20 min to ensure that it is adequately heated and prepared before
loading the samples.
Preparation of the samples and gel loading
1. Pipette 1l of each PCR product into an individual 0.5ml sterile microcentrifuge tube.
2. Add 2l of Loading Dye - Formamide ACS Reagent solution (Loading Dye with
Formamide) to each tube. Centrifuge at low speed for 4 seconds to make sure that the PCR
products and the Formamide solutions mix together .
3 Heat each tube to 85C for 60 seconds on a PCR thermal cycler denature the PCR products.
4. Turn the machine off, open the interlock. Remove the top buffer trays lid.
5. Gently insert a 48- or 64- well shark-tooth comb (depending on the number of samples)
between the two plates until the tips of the comb are approximately 1 millimetre into the gel.
Tighten the top clamps.
6. Load 0.5l of PCR products-Formamide solution into each well of the loading comb. Load
0.5l fluorescently labelled size ladder to provide a consistent identification of the molecular
weight of the alleles separated on the gel.
8. Replace the lid. Close the interlock. Set the auto gain (options, auto gain, auto). Focus the gel
(scanner control, options, focalising, auto) (check that the curve is approx. like a normal
distribution). Set auto gain again. Start electrophoresis. Electrophoresis and detection of PCR
products of up to 400 b.p. should take approximately 2 – 3 hours.
Stuttering.
This is an artifact of PCR, where DNA fragments which are one or two repeat units shorter than the
true allelic fragment are produced. This is thought to be due to replicative slippage during the PCR
reaction. Stuttering is particularly pronounced in di-nucleotide microsatellites.
[Figure: Li-Cor gel images, panels a and b.]
A. Li-Cor gel showing amplification products of the di-nucleotide microsatellite locus BW7, which
exhibits some stuttering. B. Li-Cor gel showing amplification products of the di-nucleotide
microsatellite locus BW9, showing considerably more stuttering. Di-nucleotides, particularly when
the core sequence is repeated very many times, are particularly prone to stuttering.
oooooooooooooOOOOOOOOOOOOOoooooooooooooo
APPENDICES
HINTS ON SOFTWARE FOR STATISTICAL TESTS AND
GENETIC VARIABILITY
(J. Mork, TBS)
NB! Be aware of the ADDITIVE properties of the chi-square statistic (chi-square values and degrees of freedom from
several independent tests may be summed for a stronger, overall test).
Also remember that in all types of chi-square tests, the expected value in a cell should not be less than 5, at least
not in more than 20% of the cells of a test. If this assumption is not fulfilled, one remedy is to pool cells or
alleles, or to use Monte Carlo based so-called exact tests (see «Zaykin-tests» below).
Two types of chi-square tests are very commonly used in population genetics. The first type is the «Goodness-of-fit»
test, which is used to test if the observed genotypic proportions at a locus are in correspondence with the
expected values assuming Hardy-Weinberg equilibrium (the expected values are thus calculated from the
sample allele frequencies using the binomial (or multinomial) formula (p+q+...)²).
The second type is the chi-square contingency table test, abbreviated RxC (Rows by Columns) test of
homogeneity («homogeneity» is more correct than «heterogeneity» because the null hypothesis is that the
samples are drawn from the same population and thus expected to be homogeneous).
Test for Goodness-of-fit to Hardy-Weinberg expectations
[ Programmes: HWEQ2.EXE (Chi-square, two-allele loci) ]
[ ZHIHW.EXE, exact test; multi-allele loci ]
Chi-square test (HWEQ2.EXE):
Assume a sample of 100 diploid individuals. Electrophoretic analysis of an enzyme has revealed 3 different
patterns which are interpreted as genotypes formed by the combination of two alleles called 100 and 70 based on
the electrophoretic mobilities of their products. Thus the genotypes are: 100/100, 100/70, and 70/70. The
numbers of the different genotypes are tabulated, and the allele frequencies are calculated by summing their
numbers in homo- and heterozygotes. The following table can be set up (the letter q means «frequency of»):
                    Genotypes
                    100/100    100/70    70/70      N      q*100    q*70
Observed              38         44        18      100     0.60     0.40
Expected (H-W)       (36)       (48)      (16)
chi-square           0.111      0.333     0.250
Expected (H-W) values are calculated from the binomial formula (a+b)² = a² + 2ab + b²
Genotype 100/100 = (0.6)² * 100 = 36. Genotype 100/70 = (2*0.6*0.4) * 100 = 48, etc.
chi-square = Sum of [(Observed - Expected)² / Expected] over all cells.
In this case chi-square = [(38-36)² / 36] + [(44-48)² / 48] + [(18-16)² / 16] = 0.694.
Degrees of freedom = [Number of different genotypes minus no. of different alleles] = 3-2 = 1.
The significance level P corresponding to the calculated chi-square and degrees of freedom is looked up in a
chi-square table (e.g. in Sokal & Rohlf: Biometry), or checked with a suitable computer program.
In this example, P=0.405, which is much higher than the P=0.05 rejection level which is commonly used in
biology. We therefore do not reject the null hypothesis, which is that the sample can have been drawn from a
population in Hardy-Weinberg equilibrium for the locus under study.
P gives the probability that one may encounter, by chance alone, a deviation between expected and observed
genotypic proportions as large as or larger than the one actually observed in the sample, if the sample came from a
population in Hardy-Weinberg equilibrium. A chi-square value corresponding to P=0.05 is expected to be
encountered in one out of 20 cases when sampling from one and the same H-W population.
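The goodness-of-fit calculation for a two-allele locus can be reproduced in a few lines. The sketch below is not the course programme HWEQ2.EXE; it is an independent illustration using only the Python standard library. For one degree of freedom, the P value of a chi-square statistic x equals erfc(sqrt(x/2)), which is what the code uses.

```python
import math


def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a two-allele locus.

    Returns (chi-square, P) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of the first allele
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # valid for 1 degree of freedom only
    return chi2, p_value


# Example from the text: 38 (100/100), 44 (100/70), 18 (70/70).
chi2, p_value = hardy_weinberg_chi_square(38, 44, 18)
print(f"chi-square = {chi2:.3f}, P = {p_value:.3f}")
```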
Wahlund-effect: A deviation from expected values in the form of a deficiency of heterozygotes relative to
expectations is indicative of population mixing (i.e. the sample contains a physical mixture from two or more
populations with different allele frequencies at the locus under study). The nature of this phenomenon is easily
seen by joining two equally sized subsamples where one has only the 100/100 genotype and the other only the
70/70 genotype. Their joint allele frequencies would be 0.5 and we would therefore expect half of the individuals
to be heterozygotes, whereas none will be present in the mixed sample.
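The Wahlund effect can be illustrated numerically. The sketch below assumes that each subsample is itself in Hardy-Weinberg proportions (trivially true in the example above, where each subsample is fixed for one allele) and compares the heterozygote proportion actually present in the mixture with the proportion expected from the pooled allele frequency. The function name is illustrative.

```python
def wahlund(subsamples):
    """subsamples: list of (number of individuals, frequency p of allele 100).

    Returns (heterozygote proportion in the mixture,
             heterozygote proportion expected from the pooled allele frequency).
    """
    total = sum(n for n, _ in subsamples)
    p_pooled = sum(n * p for n, p in subsamples) / total
    h_observed = sum(n * 2 * p * (1 - p) for n, p in subsamples) / total
    h_expected = 2 * p_pooled * (1 - p_pooled)
    return h_observed, h_expected


# Example from the text: one subsample all 100/100, the other all 70/70.
print(wahlund([(50, 1.0), (50, 0.0)]))
# A less extreme mixture also shows a heterozygote deficiency.
print(wahlund([(50, 0.8), (50, 0.4)]))
```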
«Zaykin-test» for Hardy-Weinberg proportions (programme ZHIHW.EXE):
If the assumption of an expected value of at least 5 in the cells is not met, a so-called exact test may be used (e.g. Zaykin &
Pudovkin's ZHIHW.EXE). Such tests are based on Monte Carlo simulations. For the software provided on BI
315, on-screen documentation will appear when the programme's name is typed. Data can be loaded from a text file
or typed in during programme execution.
Tests for inter-sample homogeneity (RxC contingency table tests)
[ Programmes: CHIRXC.EXE (Chi-square) ]
[ ZHIRXC.EXE (exact test) ]
1. CHIRXC.EXE (Chi-square):
This test is used to test if the proportions of genotypes (or alleles) are similar in different sub-groups of the
material. «Sub-groups» can be e.g. samples from different locations, sex groups, age groups etc.
The chi-square value is, as always, calculated by squaring the difference between observed and expected value
and dividing the result by the expected value. The results from this process in each cell are summed into a total
chi-square. The number of degrees of freedom is generally calculated as (R-1)(C-1).
The «expected» number in a cell is calculated based on the following line of reasoning:
The null hypothesis is that we have a number of samples taken from the same population. If so, the best estimate
we can have of the true distribution in the population is given by the largest sample we have, and that is the sum
of all the samples. We therefore use the proportions of «types» in the «Total» as a template for calculating the
expected values for each «Location» in the table below.
Thereafter, we test if the difference between observed and expected proportions is too large to be caused by
chance alone (e.g. at the 5% significance level).
RxC test of genotypic proportions:
              Genotypes
              100/100     100/70       70/70        N
Location 1    38 (24)     44 (45.3)    18 (30.7)    100
Location 2    34 (48)     92 (90.7)    74 (61.3)    200
Total         72          136          92           300
Example: expected value in cell «100/100 for Location 1»: (72/300*100) = 24
Chi-square = 20.157, DF = (2-1)(3-1) = 2, P = 0.00004, i.e. we reject the null hypothesis. The two samples are
so different in genotypic proportions that it is very unlikely that they are drawn from the same population.
Usually, we can make a more powerful test by converting the genotypic proportions to allelic proportions
(counting two alleles of the same kind in homozygotes and one allele of each type in the heterozygote). The
higher power is because the degrees of freedom is lower for alleles than for genotypes. The genotype proportions
in the table above will give the following allelic proportions (table below):
RxC test of allelic proportions (allele counts from the genotypic numbers in the table above):
              Allele
              100            70             N
Location 1    120 (93.3)     80 (106.7)     200
Location 2    160 (186.7)    240 (213.3)    400
Total         280            320            600
Chi-square = 21.429, DF = (2-1)(2-1) = 1, P = 0.000004, i.e. the null hypothesis is rejected.
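Both contingency table tests above can be recomputed from the observed counts. The following sketch is an independent illustration and not the course programme CHIRXC.EXE. The expected value in each cell is row total x column total / grand total, exactly as in the «100/100 for Location 1» example. The P value helper only covers 1 and 2 degrees of freedom, where simple closed forms exist.

```python
import math


def rxc_chi_square(table):
    """Chi-square test of homogeneity for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi_square_p_value(chi2, df):
    """Upper-tail probability of the chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


genotypes = [[38, 44, 18], [34, 92, 74]]   # rows: Location 1, Location 2
alleles = [[120, 80], [160, 240]]          # allele counts derived from the genotypes

for name, table in (("genotypes", genotypes), ("alleles", alleles)):
    chi2, df = rxc_chi_square(table)
    print(f"{name}: chi-square = {chi2:.3f}, DF = {df}, P = {chi_square_p_value(chi2, df):.6f}")
```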
2. Exact test «Zaykin-test» for RxC tables (programme ZHIRXC.EXE):
If the assumption of an expected value of at least 5 is not met in the cells of the RxC table, the Monte Carlo type exact test provided by the
computer program ZHIRXC.EXE can remedy the problem. Instructions for the use of the program are given
on-screen when the programme name is typed in on the computer.
Nomenclature conventions:
Loci, genotypes and alleles are written in italics. Recommended abbreviations for enzymes and enzyme loci can
be found in Shaklee et al. (1990). For example, the gene coding for the enzyme LDH-3 is called LDH-3*, the
most common allele is called LDH-3*100, and the homozygote for this allele is called LDH-3*100/100.
Reference:
Shaklee, J.B., F.W. Allendorf, D.C. Morizot, and G.S. Whitt. 1990. Gene nomenclature for protein-coding
loci in fish. Transactions of the American Fisheries Society 119: 2-15.
Software for calculation of Nei's I (genetic Identity) and D (genetic Distance)
(J. Mork, TBS)
The following allele frequencies at two loci in samples from two locations have been calculated from observed
genotypic proportions in the samples:
              LDH-3*              HbI*
              100      70         100      80
Location 1    0.60     0.40       0.90     0.10
Location 2    0.70     0.30       1.00     0.00
Since we have only two samples here, we may use the program GSA.EXE for calculating Genetic Distance. (The
genetic Identity can be found from the relation D = -ln(I)). When there are more than two samples, the temporary (*.TMP)
files generated by the program DG25B.EXE can be used to find all pairwise genetic distances.
Both these programs use an infile (DOS text file) with the following format (NB! the comments in parentheses are not
part of the file):
2 2 0 2 (no. of samples - no. of polymorphic loci - no. of monomorphic loci - largest no. of alleles at any locus)
2 2 (no. of alleles at the loci, in succession from left to right in the table)
«A1» 0.60 0.40 0.90 0.10 (name of sample, alphanumeric and in brackets; use codes A1-A9, B1-B9 etc. - allele frequencies)
«B2» 0.70 0.30 1.00 0.00
We may call this file e.g. Bih97.txt, and call it during execution of GSA.EXE. The output from GSA.EXE will
look like this:
[Figure: screen output from GSA.EXE.]
Output includes the calculated Gst (= the average Fst for two loci), which tells that ca. 1.8% (fraction 0.0182)
of the total genetic variation in the material is due to genetic differences between samples.
Further, GSA.EXE provides values for Genetic Distance. Since we have only two samples here, the minimum and
the maximum D-value will be the same (namely 0.0104). In a UPGMA dendrogram this will be the distance
from D=0 to the point where the two legs of the bifurcation join on the Distance axis.
It is not meaningful to do cluster analysis and dendrogram drawing with only two samples, but if we do that
anyhow (using Bih97.txt as infile for DG25B.EXE), the dendrogram will look like this:
[Figure: UPGMA dendrogram of the two samples.]
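The Gst value of 0.0182 quoted above can be reproduced from the allele frequencies in the table. The sketch below assumes Nei's definition Gst = (Ht - Hs) / Ht, with the within-sample diversity Hs and the total diversity Ht each averaged over loci and with equal weight given to the two samples. That GSA.EXE computes the statistic in exactly this way is an assumption; the sketch only shows that this definition gives the same number.

```python
def gst(samples):
    """Gst from allele frequencies.

    samples: list of samples; each sample is a list of loci;
    each locus is a list of allele frequencies (same allele order in all samples).
    """
    n_samples = len(samples)
    n_loci = len(samples[0])
    hs_sum = 0.0
    ht_sum = 0.0
    for locus in range(n_loci):
        freqs = [sample[locus] for sample in samples]
        # Mean expected heterozygosity within samples.
        hs = sum(1.0 - sum(p * p for p in alleles) for alleles in freqs) / n_samples
        # Expected heterozygosity from the mean allele frequencies.
        mean_freqs = [sum(col) / n_samples for col in zip(*freqs)]
        ht = 1.0 - sum(p * p for p in mean_freqs)
        hs_sum += hs
        ht_sum += ht
    return (ht_sum - hs_sum) / ht_sum


location_1 = [[0.60, 0.40], [0.90, 0.10]]   # LDH-3*, HbI*
location_2 = [[0.70, 0.30], [1.00, 0.00]]
print(f"Gst = {gst([location_1, location_2]):.4f}")
```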
oooooooooOOOOOoooooooooo
MEASUREMENTS OF SIMILARITIES AND DISTANCES
(Bjørn Ivar Honne, Planteforsk)
Genetic distance according to EDWARDS (1971):
Measurement of distance with allele frequencies p1, q1 and p2, q2 in two populations:
Any locus with two alleles in any population may be pictured on a quarter-circle with radius 1, with
coordinates $\sqrt{p}$ and $\sqrt{q}$, respectively. The angle between the two radii is $\theta$, and the distance 2d
is measured along the secant connecting the two points (populations), where:
$d = \sqrt{1 - \cos\theta}$
and
$d^2 = 1 - \sqrt{p_1 p_2} - \sqrt{q_1 q_2}$
Further, if the allele frequencies are not too different:
$2d^2 \approx F_{ST}$
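The approximation between Edwards' distance and FST can be checked numerically. The Python sketch below is an illustration under stated assumptions: FST is taken as the variance of p between two equally weighted populations divided by the product of the mean frequencies, a definition that is not spelled out in the text above, and the function names are hypothetical.

```python
import math

def edwards_d2(p1, p2):
    """d^2 = 1 - sqrt(p1*p2) - sqrt(q1*q2) for a locus with two alleles."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return 1.0 - math.sqrt(p1 * p2) - math.sqrt(q1 * q2)

def fst_two_pops(p1, p2):
    """Assumed definition: variance of p between two equally weighted populations / (pbar * qbar)."""
    pbar = (p1 + p2) / 2.0
    var_p = ((p1 - pbar) ** 2 + (p2 - pbar) ** 2) / 2.0
    return var_p / (pbar * (1.0 - pbar))

p1, p2 = 0.60, 0.70  # allele frequencies at the first locus of the example infile
print(2 * edwards_d2(p1, p2), fst_two_pops(p1, p2))
```

For these frequencies both quantities come out close to 0.011, which illustrates the statement that 2d² approximates FST when the allele frequencies are similar.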
According to NEI (1972, -75, -77):
Designate the frequencies of multiple alleles at one locus in a population X as $p_i$, and the frequencies
of the corresponding alleles (allelomorphs) in another population Y as $q_i$.
Then let
$J_{XX} = \sum_i p_i^2$ and $J_{YY} = \sum_i q_i^2$.
These are the probabilities that two alleles taken at random in population X and Y, respectively, are
identical in function (not necessarily identical by descent). When one allele is randomly chosen
from population X and the other from Y, the probability that they are identical (by function) is
$J_{XY} = \sum_i p_i q_i$
(summed over all identical pairs at that locus).
Then Nei’s normalized identity, I, for this locus in X and Y is:
$I = \frac{J_{XY}}{\sqrt{J_{XX} J_{YY}}}$
and the standard genetic distance, D, is:
$D = -\ln(I)$
For multiple loci the respective J’s are arithmetic means over loci. (NB! Monomorphic loci, if
encountered, are included in the analysis.)
If the alleles considered are selectively neutral, then Nei’s standard genetic distance, D, increases
approximately linearly with time.
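Nei's formulas can be applied directly to the two samples of the GSA.EXE example (A1 and B2, two loci). The Python sketch below is a hand calculation, not the code of GSA.EXE; the function name and the data layout (one list of allele frequencies per locus) are assumptions made for illustration.

```python
import math

def nei_identity_distance(x, y):
    """x, y: per-locus lists of allele frequencies for populations X and Y."""
    n_loci = len(x)
    # The J's are arithmetic means over loci
    jxx = sum(sum(p * p for p in locus) for locus in x) / n_loci
    jyy = sum(sum(q * q for q in locus) for locus in y) / n_loci
    jxy = sum(sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(x, y)) / n_loci
    identity = jxy / math.sqrt(jxx * jyy)
    return identity, -math.log(identity)

a1 = [[0.60, 0.40], [0.90, 0.10]]
b2 = [[0.70, 0.30], [1.00, 0.00]]
identity, distance = nei_identity_distance(a1, b2)
print(round(distance, 4))  # 0.0104
```

The result agrees with the D-value of 0.0104 quoted for these two samples.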
The genotypic identity and distance according to HEDRICK (1971)
Given that populations deviate from H-W proportions, the genotype frequencies may be an interesting
alternative to allele frequencies to measure identities/distances. Hedrick uses a parallel development
to Nei’s reasoning but based on genotype frequencies.
The standardized identity according to Hedrick, i.e. based on genotype frequencies, is:
$I = \frac{2 I_{XY}}{I_X + I_Y}$
where
$I_{XY} = \sum_{i \le j}^{n} P_{ij,X} P_{ij,Y}$,
$I_X = \sum_{i \le j}^{n} P_{ij,X}^2$,
and
$I_Y = \sum_{i \le j}^{n} P_{ij,Y}^2$
$P_{ij,X}$ and $P_{ij,Y}$ are the frequencies of genotype ij in population X and Y, respectively; and the
summations are over all genotypes. The genetic distance is the complement of I, or:
D = 1 - I.
Extension to multiple loci may be done by averaging I and D over loci.
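Hedrick's identity for a single locus can be sketched in a few lines of Python. The genotype frequencies below are hypothetical and serve only to illustrate the arithmetic; the function name and the dictionary layout are likewise assumptions.

```python
def hedrick_identity(px, py):
    """px, py: dicts mapping a genotype label (e.g. "AA", "AB") to its frequency at one locus."""
    genotypes = set(px) | set(py)
    i_xy = sum(px.get(g, 0.0) * py.get(g, 0.0) for g in genotypes)
    i_x = sum(f * f for f in px.values())
    i_y = sum(f * f for f in py.values())
    identity = 2.0 * i_xy / (i_x + i_y)
    return identity, 1.0 - identity  # (I, D = 1 - I)

# Hypothetical genotype frequencies, for illustration only.
pop_x = {"AA": 0.36, "AB": 0.48, "BB": 0.16}
pop_y = {"AA": 0.49, "AB": 0.42, "BB": 0.09}
print(hedrick_identity(pop_x, pop_y))
```

Two populations with identical genotype frequencies give I = 1 and D = 0; populations sharing no genotype give I = 0 and D = 1.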
The probability of a unique genotype according to HEDRICK (1971)
Important in species comparisons and other situations is the presence or absence of particular
genotypes or alleles. To focus on this aspect, the probability of a unique genotype is introduced. This
measure has two components:
$U_{x.y} = \sum_{i \le j}^{n} P_{ij.x}$, where $P_{ij.y} = 0$
and
$U_{y.x} = \sum_{i \le j}^{n} P_{ij.y}$, where $P_{ij.x} = 0$
The first component is the probability of a genotype occurring in population X and not in Y; the second
is the probability of a genotype occurring in Y and not in X. U ranges from 0 to 1. Extension to
multilocus situations is done by calculating the arithmetic average over all loci.
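The two components of the unique-genotype probability can be computed as follows. This is a minimal Python sketch for one locus; the genotype frequencies are hypothetical, and the function name is an assumption.

```python
def unique_genotype_probabilities(px, py):
    """Return (U_x.y, U_y.x) for one locus; px, py map genotype labels to frequencies."""
    # Sum the frequencies of genotypes present in X but absent (frequency 0) in Y
    u_xy = sum(f for g, f in px.items() if py.get(g, 0.0) == 0.0)
    # Sum the frequencies of genotypes present in Y but absent in X
    u_yx = sum(f for g, f in py.items() if px.get(g, 0.0) == 0.0)
    return u_xy, u_yx

# Hypothetical example: genotype BB occurs only in population Y.
pop_x = {"AA": 0.50, "AB": 0.50}
pop_y = {"AA": 0.25, "AB": 0.50, "BB": 0.25}
print(unique_genotype_probabilities(pop_x, pop_y))  # (0, 0.25)
```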
SIMILARITIES AND DISTANCES FOR RAPD BAND
PATTERNS USING JACCARD’S DICHOTOMY
COEFFICIENTS:
Analysis of data from Avena sterilis RAPDs.
Bjørn Ivar Honne, Planteforsk
Data provided by Prof. dr. Manfred Heun.
The data are from 177 RAPD bands generated with 20 primers in 24 genotypes of Avena sterilis. The
raw data are scores of presence and intensity of bands. Scores 1, 2, 3 = band present and with
increasing intensity, score 0 = no band where at least one of the other genotypes has produced a band.
(The raw data are given at the end of this chapter).
From these data we want to measure similarities or dissimilarities among the 24 genotypes (lines), and
then use these measures to construct a tree in which similar genotypes cluster closely together, and
less similar genotypes join at increasing distances as the dissimilarity increases.
We will return to the tree construction later; let us first decide how to measure similarities.
We could use the raw data directly and measure similarities among the genotypes by Goodman-Kruskal’s gamma
coefficient. (The gamma coefficient is suited for discontinuous/discrete data.)
However, this is not quite suitable here. First, with this measure, a score of 0 in two genotypes would
contribute to similarity between those two genotypes, whereas the mutual lack of a band where other
genotypes have a band is no positive confirmation that the two zeros are similar. The two genotypes may fail to
produce the band for completely different reasons (DNA base compositions). The intensity of bands is
another difficulty. We do not know the reproducibility here, and if it is not high, genotypes
with the same bands present but with different intensities would be measured as less similar than
those with the same intensity. So the only hard evidence we have is presence or absence of bands.
Thus the data should be dichotomized, i.e. represented by 0 or 1 only (or any other pair of digits).
This can be achieved by retaining all 0 (zero) as 0, and replacing all values greater than 0 with 1.
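The dichotomizing rule just described can be written as a one-line transformation. The Python sketch below uses a small hypothetical score table (rows = bands, columns = genotypes); the function name is an assumption.

```python
def dichotomize(scores):
    """Keep 0 as 0 and replace every score greater than 0 (intensity 1, 2 or 3) with 1."""
    return [[1 if s > 0 else 0 for s in row] for row in scores]

# Hypothetical intensity scores for three bands in four genotypes.
raw = [
    [0, 2, 3, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 2],
]
print(dichotomize(raw))  # [[0, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
```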
When our raw data are dichotomized, we may compare all possible pairs of genotypes (i,j) by
presence (1) or absence (0) of each of the 177 possible bands (x), and compute a matrix of similarity
coefficients. As mentioned above, we would count as similarity when two genotypes produce the
same band, as dissimilarity when the one genotype produces a band and the other not, and exclude as
not informative the cases where both genotypes lack a possible band.
Referring to the table below, xi and xj represent the possible bands in two genotypes/lines. For each
band position (bands are identified by primer used and fragment length), there is either a 0 or a 1 in each
genotype; a is the number of cases with the same band in both, b is band in j but not in i, c is band in i but not
in j, and d is no band in either where at least one of the other genotypes investigated has produced a
band. The latter is the number of non-informative cases when similarity is considered. So the total
number of bands in the two genotypes is a + b + c, of which a are shared bands.
                    xi
               1         0
xj      1      a         b        a+b
        0      c         d        c+d
              a+c       b+d
So the similarity coefficient, Si,j , would be:
$S_{i,j} = \frac{a}{a + b + c} = \frac{M_{i,j}}{T_i + T_j - M_{i,j}}$,
also called Jaccard’s dichotomy coefficient (varying from 0 to 1). The second form uses a notation
with Mi,j for the number of common bands, Ti for the total number of bands in i, and Tj for the
total number of bands in j.
A measure of distance between the two genotypes, di,j , would be:
di,j = 1 - Si,j ,
leading to distance = 0 for similarity 1 (all bands shared, i.e. b = c = 0; Mi,j = Ti = Tj ), and
distance = 1 for similarity 0 (i.e. a = Mi,j = 0 and b and/or c > 0). The similarity matrix for all pairs
among the 24 genotypes is given at the end of this chapter.
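The counts a, b, c and the resulting Jaccard similarity and distance can be obtained from two dichotomized band vectors as sketched below in Python. The two vectors are hypothetical, and the function name is an assumption; cases where neither genotype has a band (d) are simply ignored.

```python
def jaccard(xi, xj):
    """Return (S, 1 - S) for two 0/1 band vectors of equal length."""
    a = sum(1 for u, v in zip(xi, xj) if u == 1 and v == 1)  # band in both
    b = sum(1 for u, v in zip(xi, xj) if u == 0 and v == 1)  # band in j, not in i
    c = sum(1 for u, v in zip(xi, xj) if u == 1 and v == 0)  # band in i, not in j
    if a + b + c == 0:
        raise ValueError("no bands in either genotype; similarity is undefined")
    s = a / (a + b + c)
    return s, 1.0 - s

# Hypothetical dichotomized scores at six band positions.
xi = [1, 1, 0, 1, 0, 0]
xj = [1, 0, 1, 1, 0, 1]
print(jaccard(xi, xj))  # a = 2, b = 2, c = 1, so S = 2/5
```

The same value follows from the alternative notation: Mi,j = 2, Ti = 3, Tj = 4, so S = 2/(3 + 4 - 2) = 0.4.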
The distances as defined above may then be used to cluster the genotypes in a tree structure. However,
trees can be constructed in several ways from one and the same measure of distance between pairs of
genotypes, and different methods usually do not produce the same tree.
Here we shall show two methods: the average method (also called UPGMA, the unweighted
pair group method with arithmetic mean) and the additive tree method. The UPGMA method and
others are briefly described in Hartl & Clark, p. 378 ff. The reason for also showing the result of the
additive method is that UPGMA and other hierarchical tree methods imply that all within-cluster
distances are smaller than all between-cluster distances and that within-cluster distances are equal.
This («ultrametric») condition seldom applies to real similarity data. In an additive tree, distances
between objects/genotypes are represented by the lengths of the branches connecting them.
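To make the average (UPGMA) method concrete, here is a minimal pure-Python sketch. It is not the SYSTAT implementation; the function name and data layout are assumptions, and each merge is reported at the average distance at which the two clusters join. The example uses 1 - S for four genotypes (A1, A2, B1, B2) with S taken from the Jaccard similarity matrix for the Avena data.

```python
def upgma(labels, dist):
    """dist maps frozenset({a, b}) to the distance between labels a and b.
    Returns a list of merges: (members of cluster 1, members of cluster 2, joining distance)."""
    clusters = {label: (label,) for label in labels}  # cluster id -> member labels
    d = dict(dist)
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance
        pair = min(
            (frozenset((a, b)) for a in clusters for b in clusters if a < b),
            key=lambda k: d[k],
        )
        a, b = sorted(pair)
        new_id = a + "+" + b
        na, nb = len(clusters[a]), len(clusters[b])
        # Size-weighted mean = arithmetic mean over all pairs of members
        for c in clusters:
            if c not in (a, b):
                d[frozenset((new_id, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        merges.append((clusters[a], clusters[b], d[pair]))
        members = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        clusters[new_id] = members
    return merges

# Jaccard similarities for four of the Avena genotypes, converted to distances 1 - S.
sim = {("A1", "A2"): 0.8250, ("A1", "B1"): 0.6667, ("A1", "B2"): 0.6818,
       ("A2", "B1"): 0.6350, ("A2", "B2"): 0.6370, ("B1", "B2"): 0.8760}
dist = {frozenset(pair): 1.0 - s for pair, s in sim.items()}
for left, right, height in upgma(["A1", "A2", "B1", "B2"], dist):
    print(left, right, round(height, 4))
```

With these four genotypes, B1 and B2 join first, then A1 and A2, and the two pairs join last at the mean of the four between-pair distances.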
Starting with the raw data file at the end of this chapter, dichotomizing the data, calculating
similarities and distances and constructing a tree can be achieved with the following commands from
inside the program package ‘SYSTAT’ (v. 6.0) ( > marks the program prompt; the comments in parentheses
are not part of the commands):
>corr                                  (invokes the module which calculates correlations, similarities etc.)
>use avenak                            (name of the file with raw data)
>save avenas3                          (names the file where we save the similarity matrix)
>let (A1..D6) = @ <> 0                 (dichotomizes the raw data)
>s3 A1..D6                             (calculates similarities as Jaccard’s dichotomy coefficients)
>cluster                               (switches to the module for cluster analyses)
>use avenas3                           (uses the similarity matrix)
>join A1..D6 /linkage=average(,polar)
(The program recognises the file as a similarity matrix, uses 1-Si,j to calculate distances, constructs the
tree according to the ‘average’/UPGMA method and produces a polar plot if the option within
parentheses is used.)
Other statistical packages such as SAS can also produce the same results (with different commands).
Cluster analysis of data from Avena sterilis. Distance used is 1 - Jaccard’s
dichotomy coefficient, linkage=‘average’; upper graph cartesian plot, lower
graph polar plot.
Additive tree using Jaccard’s dichotomy coefficients on data from Avena
sterilis RAPD’s. NB! As given here the tree is unrooted.
Raw data from scoring 177 RAPD bands generated with 20 primers (A-01, A-02, ....., B-08) applied to
24 genotypes (A1, A2, ....., D6). Presence of a band is designated 1, 2 or 3 according to band
intensity; absence of a band where at least one other genotype has produced a band is designated 0.
____________ Genotypes __________________
PRIM. A A A A A A B B B B B B C C C C C C D D D D D D
1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
A-01 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-02 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-02 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-02 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-03 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0.
A-03 1. 1. 1. 1. 1. 1. 0. 0. 0. 2. 0. 2. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.
A-03 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 1. 2. 0. 2. 1.
A-03 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-03 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-03 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-03 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 0. 1. 2.
A-04 2. 2. 2. 0. 2. 1. 0. 0. 1. 0. 0. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 0. 1. 1.
A-04 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 2. 2. 2. 2. 2. 2.
A-04 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0.
A-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-04 2. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-04 1. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-04 0. 0. 0. 0. 0. 0. 2. 0. 1. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-09 2. 0. 2. 0. 2. 0. 0. 0. 2. 2. 0. 0. 0. 0. 0. 0. 2. 2. 0. 0. 0. 2. 0. 0.
A-09 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-12 0. 0. 2. 0. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
A-12 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 1. 0. 2.
A-12 0. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 0. 2. 1. 0. 2. 2. 2. 2. 2. 2.
A-12 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-12 2. 1. 1. 0. 0. 1. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 1. 1. 1. 0. 1. 1.
A-12 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-12 1. 2. 2. 2. 2. 2. 2. 0. 0. 2. 2. 0. 2. 2. 2. 2. 1. 2. 0. 0. 0. 0. 0. 0.
A-12 0. 0. 0. 0. 0. 0. 2. 0. 2. 2. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-12 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-13 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-13 1. 2. 2. 2. 2. 2. 1. 1. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2.
A-13 0. 1. 2. 0. 2. 2. 0. 0. 0. 2. 0. 0. 0. 2. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.
A-13 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 3. 2. 2. 2. 2.
PRIM. A A A A A A B B B B B B C C C C C C D D D D D D
1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
A-14 2. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1. 1. 2. 2. 1. 1. 1. 2. 1. 2. 1. 1. 2. 2.
A-14 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0.
A-14 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-14 2. 1. 1. 0. 1. 1. 2. 2. 2. 2. 2. 2. 1. 1. 1. 0. 2. 2. 2. 2. 2. 2. 2. 2.
A-16 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 1. 0. 0. 0. 1. 1. 0. 0. 0. 1.
A-16 2. 0. 2. 2. 2. 0. 2. 2. 0. 2. 0. 2. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0.
A-16 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-16 0. 0. 2. 2. 0. 2. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
A-16 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-16 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 0. 2. 0. 2.
A-16 1. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 1. 2. 0. 0. 0. 0. 2. 0. 2. 0.
A-16 2. 2. 2. 2. 2. 2. 1. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-16 2. 2. 2. 0. 2. 2. 1. 1. 0. 0. 0. 1. 2. 2. 2. 2. 2. 2. 1. 2. 2. 1. 2. 2.
A-16 0. 0. 0. 2. 0. 0. 2. 0. 1. 2. 0. 1. 0. 0. 0. 0. 1. 0. 2. 0. 0. 0. 0. 2.
A-16 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-16 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 3. 0. 3. 0. 3.
A-16 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-17 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2.
A-17 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-17 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-17 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2.
A-17 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0.
A-17 1. 1. 1. 1. 0. 0. 2. 1. 0. 2. 0. 2. 2. 2. 1. 0. 1. 0. 1. 1. 1. 1. 1. 1.
A-17 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-17 0. 0. 0. 0. 0. 0. 2. 2. 0. 1. 2. 2. 2. 0. 2. 2. 1. 1. 1. 1. 1. 1. 1. 1.
A-17 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-18 0. 0. 0. 0. 0. 0. 2. 2. 1. 0. 2. 0. 0. 2. 2. 2. 0. 2. 2. 0. 2. 2. 2. 1.
A-18 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 0. 2.
A-18 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 0. 0. 0.
A-18 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-18 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 0. 2.
A-18 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-18 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-18 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-18 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-18 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2.
A-19 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-19 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2.
A-19 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-19 0. 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-19 2. 2. 0. 2. 2. 2. 2. 0. 2. 2. 2. 2. 0. 2. 0. 0. 1. 0. 2. 2. 2. 2. 2. 2.
A-19 2. 2. 0. 2. 0. 2. 0. 0. 0. 0. 0. 2. 2. 0. 2. 2. 1. 2. 2. 2. 2. 0. 2. 2.
A-19 2. 2. 2. 2. 2. 2. 1. 1. 1. 1. 1. 1. 2. 2. 2. 2. 1. 2. 2. 1. 1. 2. 2. 2.
A-19 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.
A-19 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 0. 2. 2.
A-19 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-19 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
PRIM. A A A A A A B B B B B B C C C C C C D D D D D D
1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
A-20 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-20 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2.
A-20 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 1. 1. 1. 1. 1. 2. 0. 0. 0. 2. 1.
A-20 0. 0. 0. 1. 0. 0. 2. 0. 2. 2. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
A-20 2. 2. 0. 2. 2. 2. 0. 1. 0. 0. 0. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
A-20 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
A-20 0. 0. 0. 0. 0. 0. 2. 2. 0. 2. 2. 2. 0. 1. 2. 0. 2. 2. 0. 0. 0. 0. 0. 0.
A-20 0. 0. 0. 2. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 2. 2. 2. 2. 2. 0. 0. 2. 0. 2.
A-20 1. 1. 1. 0. 0. 1. 0. 0. 2. 0. 0. 0. 2. 0. 0. 0. 0. 1. 1. 2. 2. 1. 1. 1.
A-20 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0.
A-20 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 2. 2. 0. 0. 2.
A-20 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 1. 1. 2. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 0.
A-20 0. 0. 2. 0. 0. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2.
A-20 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 0.
A-20 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-01 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0. 0. 0. 2.
B-01 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0.
B-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-01 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-01 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
B-01 2. 2. 2. 0. 2. 2. 1. 1. 1. 1. 2. 2. 2. 0. 2. 2. 1. 2. 1. 0. 0. 1. 1. 2.
B-01 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 2. 2. 0. 0. 1. 2. 1. 1. 2. 1. 2. 1.
B-01 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-01 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 2. 0. 0. 0.
B-02 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0.
B-02 0. 1. 1. 0. 0. 0. 0. 0. 2. 2. 1. 2. 1. 1. 0. 1. 1. 0. 1. 0. 0. 2. 0. 2.
B-02 2. 2. 3. 3. 2. 2. 3. 3. 3. 2. 3. 2. 3. 2. 2. 3. 2. 2. 3. 2. 2. 3. 2. 3.
B-02 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-02 3. 3. 1. 0. 3. 3. 0. 0. 0. 0. 2. 2. 0. 0. 3. 0. 3. 0. 2. 1. 1. 2. 1. 2.
B-02 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 0. 0. 0.
B-02 0. 0. 0. 0. 0. 0. 3. 2. 2. 2. 1. 2. 0. 0. 0. 0. 2. 0. 2. 2. 2. 2. 2. 2.
B-02 0. 0. 2. 2. 0. 0. 0. 2. 2. 1. 2. 2. 0. 0. 0. 2. 0. 0. 2. 2. 2. 2. 2. 2.
B-02 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 3. 0. 0. 0. 0. 0. 3. 3. 0. 3. 0.
B-02 1. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 2. 0. 0. 0. 0. 0. 0. 0. 0.
B-02 2. 0. 0. 0. 0. 0. 2. 2. 2. 2. 0. 2. 0. 2. 0. 0. 1. 0. 2. 2. 2. 2. 2. 2.
B-03 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-03 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 2. 3. 3. 3. 3. 3.
B-03 0. 0. 2. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0.
B-03 0. 2. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
B-03 2. 2. 2. 1. 2. 2. 0. 0. 2. 0. 2. 0. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2.
B-03 0. 0. 0. 0. 0. 0. 0. 3. 3. 0. 3. 0. 0. 0. 0. 2. 0. 0. 0. 2. 0. 0. 0. 2.
B-03 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0.
B-03 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-03 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
PRIM. A A A A A A B B B B B B C C C C C C D D D D D D
1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
B-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-04 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-04 2. 2. 2. 2. 2. 2. 2. 2. 1. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-04 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-04 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 2. 0. 0. 0. 0. 0. 2. 2. 2. 0. 0. 1. 0. 2.
B-04 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-04 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-05 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-05 0. 2. 2. 0. 2. 2. 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.
B-05 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-05 0. 2. 0. 0. 0. 2. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-05 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-05 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-05 0. 0. 0. 0. 2. 0. 2. 2. 2. 0. 2. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
B-06 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-06 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-06 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 0. 2. 0. 0. 2. 0.
B-06 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2. 2.
B-06 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0.
B-06 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-06 0. 0. 0. 0. 0. 0. 2. 2. 2. 2. 2. 2. 0. 0. 0. 2. 0. 2. 0. 0. 0. 2. 2. 0.
B-06 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-07 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 2. 0.
B-07 0. 0. 1. 1. 0. 0. 2. 2. 0. 0. 0. 0. 0. 2. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0.
B-07 2. 2. 2. 3. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
B-07 0. 0. 0. 0. 0. 0. 2. 2. 0. 2. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 1.
B-07 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-07 0. 0. 0. 3. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 3. 3. 3. 3. 3. 0.
B-07 1. 2. 3. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 3. 2. 2. 2. 3. 2. 2. 2. 2. 2. 2.
B-07 0. 0. 2. 2. 2. 2. 2. 2. 0. 2. 0. 0. 2. 2. 0. 1. 1. 0. 0. 2. 2. 0. 2. 2.
B-08 2. 0. 2. 0. 2. 3. 0. 2. 2. 0. 2. 2. 3. 2. 3. 0. 0. 3. 2. 0. 0. 0. 2. 0.
B-08 2. 0. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 0. 0. 2. 2. 2. 2. 2. 0. 3. 2. 3. 3.
B-08 0. 2. 2. 2. 2. 2. 0. 0. 0. 0. 1. 0. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
B-08 2. 0. 2. 0. 0. 2. 0. 0. 2. 0. 2. 0. 0. 2. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0.
B-08 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 1. 0. 2. 2. 2. 2.
B-08 0. 0. 0. 0. 2. 0. 2. 2. 2. 2. 0. 0. 0. 2. 0. 2. 2. 0. 0. 0. 0. 0. 0. 0.
B-08 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
B-08 2. 0. 2. 2. 2. 0. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0.
B-08 0. 3. 0. 3. 0. 0. 3. 3. 3. 3. 3. 0. 0. 0. 0. 3. 0. 3. 3. 0. 0. 3. 0. 0.
B-08 2. 2. 2. 2. 2. 2. 0. 0. 0. 0. 0. 0. 1. 2. 2. 2. 2. 2. 2. 2. 0. 2. 2. 2.
B-08 0. 2. 2. 2. 2. 2. 0. 0. 0. 0. 2. 0. 2. 2. 2. 1. 2. 2. 2. 2. 2. 2. 2. 2.
B-08 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 1. 0. 1. 1. 0. 0.
Similarity matrix with Jaccard’s dichotomy coefficient for the Avena data
(Lower triangle; columns A1-B6 for all 24 genotypes, followed by columns C1-D6 for genotypes C1-D6.)

     A1     A2     A3     A4     A5     A6     B1     B2     B3     B4     B5     B6
A1   1.0000
A2   0.8250 1.0000
A3   0.8065 0.7840 1.0000
A4   0.7200 0.7541 0.7381 1.0000
A5   0.8099 0.8167 0.8279 0.7541 1.0000
A6   0.8049 0.8729 0.8226 0.7360 0.8729 1.0000
B1   0.6667 0.6350 0.6357 0.6541 0.6593 0.6214 1.0000
B2   0.6818 0.6370 0.6741 0.6565 0.6870 0.6471 0.8760 1.0000
B3   0.6815 0.6377 0.6619 0.6204 0.6618 0.6475 0.7846 0.7891 1.0000
B4   0.6593 0.6397 0.6642 0.6591 0.6642 0.6259 0.8770 0.8080 0.8047 1.0000
B5   0.6692 0.6870 0.6866 0.6692 0.6870 0.6970 0.7597 0.7920 0.8320 0.7656 1.0000
B6   0.7068 0.6741 0.6619 0.6324 0.6496 0.6594 0.7846 0.7891 0.7463 0.8047 0.7348 1.0000
C1   0.7705 0.8067 0.7886 0.7438 0.8067 0.8167 0.6493 0.6899 0.6642 0.6541 0.7031 0.6767
C2   0.7559 0.7760 0.7874 0.7440 0.8049 0.8000 0.6889 0.7045 0.6912 0.6940 0.7045 0.6788
C3   0.8250 0.8017 0.7840 0.7258 0.8017 0.7967 0.6842 0.7132 0.6618 0.6642 0.7266 0.7252
C4   0.7231 0.7559 0.7273 0.7520 0.7559 0.7385 0.6715 0.7121 0.6861 0.6765 0.7121 0.6861
C5   0.7463 0.7786 0.7630 0.7348 0.8203 0.7879 0.6950 0.6978 0.6853 0.7122 0.6978 0.6972
C6   0.7742 0.7520 0.7364 0.7063 0.7951 0.7619 0.6544 0.6947 0.6815 0.6593 0.7209 0.6691
D1   0.6884 0.6934 0.6573 0.6642 0.6571 0.6786 0.6528 0.6906 0.6901 0.6573 0.7029 0.7021
D2   0.7099 0.7287 0.6642 0.6718 0.6767 0.7252 0.5903 0.6377 0.6154 0.5944 0.6377 0.6383
D3   0.7442 0.7500 0.6838 0.6667 0.6970 0.7462 0.6312 0.6569 0.6454 0.6241 0.6569 0.6934
D4   0.6765 0.6815 0.6454 0.6277 0.6569 0.6547 0.6525 0.7037 0.6786 0.6691 0.6912 0.6786
D5   0.7578 0.7638 0.6963 0.6667 0.7231 0.7597 0.6788 0.7197 0.6934 0.6715 0.7068 0.7444
D6   0.7090 0.7273 0.6763 0.6716 0.6889 0.7239 0.6714 0.7111 0.6978 0.6643 0.6985 0.6978

     C1     C2     C3     C4     C5     C6     D1     D2     D3     D4     D5     D6
C1   1.0000
C2   0.7951 1.0000
C3   0.8534 0.8049 1.0000
C4   0.7600 0.7462 0.8279 1.0000
C5   0.7692 0.8372 0.8062 0.7630 1.0000
C6   0.8000 0.7559 0.8559 0.7778 0.7727 1.0000
D1   0.7090 0.6738 0.7313 0.6929 0.7273 0.7259 1.0000
D2   0.7323 0.7068 0.7023 0.6642 0.7000 0.6593 0.7687 1.0000
D3   0.7266 0.7273 0.7364 0.6963 0.7194 0.7045 0.7895 0.8926 1.0000
D4   0.6970 0.6500 0.6815 0.6691 0.7042 0.7015 0.8397 0.7313 0.7652 1.0000
D5   0.7823 0.7405 0.7778 0.7348 0.7445 0.7442 0.7895 0.8320 0.8699 0.7652 1.0000
D6   0.7442 0.7059 0.7538 0.7388 0.7737 0.7348 0.8615 0.8346 0.8281 0.7955 0.8140 1.0000
DNA ANALYSIS TECHNIQUES
(Paul Galvin 1998)
Aquaculture Development Centre, Department of Zoology and Animal Ecology,
University College Cork, Ireland
Tel: +353 21 904053
Fax: +353 21 277922
E-mail: P.Galvin@UCC.IE
1. Electrophoresis of DNA
Unlike proteins, where at least part of the separation is the result of different charges (due to amino
acid substitutions), all DNA is negatively charged, so the DNA fragments migrate towards the
positive electrode. The distance that the fragments migrate is therefore primarily related to the size of
the fragments (smaller fragments move more quickly), although the secondary structure of the DNA
can also be important (as utilised in single stranded conformational polymorphism (SSCP) analysis)
(Hillis et al. 1996). The degree of "sieving" that is effected by the gel matrix is therefore of the
utmost importance.
Two types of gel matrix are commonly used: polyacrylamide and agarose
(Maniatis et al. 1982). Polyacrylamide gel electrophoresis is usually carried out on a vertical gel
apparatus, with concentrations of acrylamide ranging from 4 to 8%, depending on the range of
fragment sizes that need to be separated. The lower the percentage of acrylamide, the easier it is for
the DNA to pass through the gel matrix. Therefore, lower percentage acrylamide concentrations tend
to be used to separate larger fragments, while higher percentage concentrations reduce the speed of
migration for the smaller fragments, thus concentrating the fragments into sharp bands. Similarly
agarose gels can range from 0.5% to 5% w/v where the highest concentrations are used only for
fragments of < 300bp. While polyacrylamide gels tend to provide optimal resolution for smaller
fragments, improvements in the quality of available agarose (although high quality agarose tends to be
expensive), have enabled the use of agarose gels for separating microsatellite alleles. One important
factor influencing the choice is the characteristic neurotoxicity of acrylamide in its unpolymerised
state.
2. Visualisation of DNA
Following the separation of the fragments, visualisation of the fragments can be achieved in a number
of ways. Probably the simplest method (if sufficient quantity of the required type of DNA is present),
is to stain with ethidium bromide and view over a UV transilluminator. The simplicity of this method
makes it popular especially for routine assessment. However, ethidium bromide intercalates with the
DNA, making it a highly mutagenic substance, presenting a hazard in the laboratory. Its sensitivity is
also limited to approximately 10ng of DNA. DNA bands on a gel can also be revealed by silver
staining, and this is a valid alternative especially for polyacrylamide gels. Where the DNA of interest
occurs in very low concentrations, some form of tagging either with a radioactive or
chemiluminescent label is required for detection. When radioactive labels are used (e.g. 32P), the
DNA is transferred from the gel to a nylon membrane (e.g. by Southern blotting) and a labelled probe
is hybridised to the membrane, which is then exposed to X-ray film.
Automated DNA sequencers combine laser technology, fluorescence detection and carefully
regulated polyacrylamide gel electrophoresis to enable automated sequence analysis and genotype
characterisation. This relatively expensive technology is fast becoming the method of choice for
detecting genetic variability within and among species (i.e. genotyping of microsatellite loci and
sequencing of mtDNA or transcribed sequences respectively) (Ziegle et al. 1992; Tully et al. 1993).
Not only does this approach avoid the need to use radio-active isotopes, it also lends itself to
automation and hence provides the possibility for a high throughput, enabling screening of larger
numbers of loci and populations than has been possible with manual methods. In addition, through
the use of image analysis software to characterise the genotypes, based on the mobility of the bands
relative to reference bands, genotype characterisation can be standardised among different researchers
within and between laboratories.
3. Restriction digestion of DNA
Restriction enzymes cut DNA at specific recognition sequences. Different restriction enzymes have
characteristic recognition sequences (restriction sites) of four, five or six base pairs, the length of
which affects the number of times that a fragment will be cut. That is, generally six-base cutters will
have fewer recognition sites in any given sequence and so will result in fewer fragments. Restriction
fragment length polymorphisms (RFLPs) result when a mutation changes a sequence such that it
either generates a new restriction site where there wasn't any previously, or results in the loss of an
existing restriction site. RFLPs are detected by digestion with a restriction enzyme, followed by
separation of the fragments of DNA (on an electrophoresis gel). This enables interpretation of the
pattern to determine where the restriction sites have occurred, revealing differences among
chromosomes due to the gain or loss of a restriction site.
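The gain or loss of a restriction site can be illustrated with a small simulated digest. The Python sketch below uses the EcoRI recognition sequence GAATTC (cut between G and AATTC); the two 40-bp alleles are hypothetical sequences made up for this example, and the function name is an assumption. Only one strand of a linear fragment is considered.

```python
def digest(sequence, site, cut_offset):
    """Return the fragment lengths after cutting a linear sequence at every occurrence of site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # position of the cut within the site
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

ECORI = "GAATTC"  # EcoRI recognition sequence, cut after the first G

# Hypothetical allele 1 carries two EcoRI sites.
allele_1 = "ATGCGTAC" + "GAATTC" + "TTAGGCATCGATCG" + "GAATTC" + "AGTCCA"
# Hypothetical allele 2: a point mutation (A -> G) destroys the second site.
allele_2 = "ATGCGTAC" + "GAATTC" + "TTAGGCATCGATCG" + "GAGTTC" + "AGTCCA"

print(digest(allele_1, ECORI, 1))  # [9, 20, 11]
print(digest(allele_2, ECORI, 1))  # [9, 31]
```

On a gel, allele 1 would show fragments of 9, 20 and 11 bp, whereas allele 2 would show 9 and 31 bp; the missing 20 and 11 bp bands and the new 31 bp band reveal the lost site.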
4. The Polymerase Chain Reaction (PCR)
The PCR technique has revolutionised molecular biology (Saiki et al. 1988). It is now possible to
isolate sections of DNA by designing primers complementary to the sequences at either side, and to
amplify such sections of DNA to facilitate different forms of manipulation. The sensitivity of the
technique means that careful attention to the practice is required. Contamination with foreign DNA
of any of the solutions or materials connected to the reactions can result in the failure to amplify the
correct product, or the presence of artefact products. As a general rule, the results of any reaction
should be 100% reproducible and for any locus, the products should segregate according to
Mendelian expectations. By following these criteria, the risk of artefact bands being misinterpreted
resulting in inaccurate genotype characterisation can be minimised.
5. Sampling considerations
DNA fragments tend to be broken down rapidly by endonucleases once the cell dies. While certain
techniques that are based on small fragments of DNA (e.g. microsatellite analysis) are tolerant of high
levels of degradation, and have enabled the use of ancient DNA in some applications (Pääbo 1989), a
good sampling programme should aim to ensure that the quality of DNA available for analysis is as
high as possible. This is achieved by ensuring that the endonucleases that cause the degradation are
prevented from becoming active. One option is to freeze the tissue immediately post-mortem, and
avoid any defrosting of the tissue before the DNA extraction. Alternatively, some of the tissue can be
placed in 99% ethanol (in at least three times the volume of the tissue), and given a quick shake to
ensure that the tissue is well immersed. Other alternatives may be available depending on the species
of interest and the tissue type: in the case of blood and small insects, smearing of the tissues (or
blood) on a glass slide and air drying (especially appropriate for warm countries where freezing is
impractical and ethanol is difficult to acquire) is known to yield good quality DNA. As tissues for
different species can differ substantially, a pilot study is required prior to any large sampling
programme, to ensure that the proposed method of storage will provide DNA of sufficient quality to
apply the relevant analysis technique.
The two criteria which are important to DNA analysis are the quality of the DNA in terms of
the degree to which it is degraded and its purity. While DNA of maximum quality (i.e. DNA that has
not been broken down into small fragments) is always desirable, it is not always possible to obtain
DNA which has not already been degraded by endonucleases and for some applications which involve
analysis of relatively small DNA fragments, it is not essential. For other applications, particularly
those involving restriction analysis of non-amplified DNA, high quality DNA is a prerequisite.
Therefore, the final application should be considered when choosing the type of method for DNA
extraction. The second variable to consider is the purity of the DNA. While phenol-chloroform based
extraction protocols are designed to remove impurities from the DNA, other approaches simply
release the DNA into solution. The latter can be adequate for many PCR based applications, while
some impurities can inhibit restriction enzymes, which affects applications that involve digestion of
the total DNA. Therefore, as with DNA quality, the time and resources invested in DNA purification
should match the application for which the DNA is required. Unlike DNA quality, which is
dependent on the condition of the DNA when it is received, it should always be possible to obtain
pure DNA if required.
One of the most important considerations when undertaking a population study is to determine
how many individuals need to be sampled. However, for any given study, there will be limiting
resources (e.g. labour, consumables, etc.). Therefore, it is necessary to balance the need to collect as
large a sample size as possible, against the need to screen as many populations as possible, and the
need to get allele frequencies from as many loci as possible. Whether the aim is to describe genetic
variation in populations or species, it will be necessary to get an adequate picture of the variation
within the taxonomic unit (population or species), in order to be able to determine the degree of
differentiation among taxonomic units. Similarly, the choice of the number of loci is a form of
genetic sampling: one or a few loci may provide only a gene phylogeny (which may not necessarily be
representative of the whole genome), whereas what is ideally required is an organismal phylogeny,
based on as many loci as possible (Weir 1990b).
These considerations mean that there is no simple recommendation for the "ideal" sample size,
number of samples, or number of loci. Each situation will be case specific, so that it is necessary to
weigh up the above three constraints taking the type of marker into consideration. Some taxonomic
groups show considerable genetic variation among locations. Therefore, in order to be able to
determine genetic differentiation between two populations from different areas for example, it would
be necessary to get a reliable estimate of within population genetic diversity.
6. Molecular Markers
6.1 Mitochondrial DNA analysis
One of the first parts of the genome to be studied was the mitochondrial DNA (see Avise et al. 1987),
since it was a manageable size (approximately 16,000bp) and could be isolated from the rest of the
DNA by caesium chloride gradient centrifugation. This enabled RFLP analysis of the total
mitochondrial genome by silver-staining following electrophoresis.
Since the isolation of mtDNA involved caesium chloride gradient centrifugation (a hazardous
and time consuming procedure), together with its requirement for fresh tissue, many researchers
switched to using mtDNA as a probe (Hynes et al. 1989), so that there was no longer any need to
separate mtDNA from nuclear DNA prior to electrophoresis. Instead, total DNA was digested with a
restriction enzyme and then separated by agarose gel electrophoresis. Due to the fragile nature of the
agarose gels, the DNA was transferred and fixed to a nitrocellulose membrane, so that it was possible
to probe the membrane. The probe, consisting of the mtDNA fragment (together with a marker to
enable sizing of the bands), was labelled with a radioactive isotope (e.g. 32P) and hybridised to the
DNA bound to the membrane. Following autoradiography, an X-ray film revealed the banding
pattern, from which it was possible to determine the presence or absence of restriction sites. While
the technique was quite expensive, due to the need to repeat the procedure for each enzyme, it did
avoid the requirement for direct isolation of mitochondrial DNA (which required fresh liver tissue),
and provided analysis of restriction sites over the whole mtDNA genome. It was also quite
demanding with respect to the DNA requirements, such that in order to be able to screen a number of
enzymes, several micrograms of high quality DNA were required.
Following the development of the PCR approach, comparisons between mtDNA sequences
from a wide range of vertebrate species revealed that certain parts of the mtDNA genome were highly
conserved among species. This enabled the development of "universal" primers (Kocher et al. 1989),
that were capable of amplifying segments of the mtDNA from most vertebrates. This has resulted in
the cytochrome b (cyt b) region being studied intensively across a wide range of species, initially by
RFLP analysis and subsequently by direct sequence analysis (Meyer 1993).
6.2 DNA Fingerprinting
Transfer of DNA from an agarose gel to a nylon membrane (or nitro-cellulose) was also a feature of
multi-locus minisatellite DNA analysis (Jeffreys et al. 1985a). Again, total genomic DNA was
isolated, a few micrograms of which were cut with a four base restriction enzyme and separated by
electrophoresis on an agarose gel. The DNA was then transferred to a membrane by Southern blotting
as for the previous technique. While probing with an mtDNA probe and with a
multilocus probe (such as Jeffreys' 33.15/33.6) yields completely different banding patterns, the
procedure is essentially the same, progressing from probing and hybridisation through to
autoradiography. This approach continues to be used in single locus minisatellite DNA analysis
(Jeffreys et al. 1985b), and anonymous cDNA analysis (e.g. Pogson et al. 1995). With respect to these
techniques, the main progress over the last decade has been the improvement of chemiluminescent
(non-radioactive) staining procedures and the ability to PCR amplify the probes; this not only removes
the necessity to grow up the inserts by cloning, but also enables labelling of the probes during PCR.
Minisatellite regions consist of multiple tandem repeats of a core sequence of 9-100 bp in
length (Tautz, 1993). The technique of single locus minisatellite DNA analysis provided the first real
alternative to screening for genetic variability at protein coding loci by isozyme analysis, since it
enabled a high throughput of samples using an assumed non-coding (selectively neutral) region of
DNA to reveal high levels of genetic variability. A multi-locus minisatellite DNA probe consists of a
sequence of several minisatellite repeat units. Many minisatellite regions share repeat units of similar
sequence. If a sequence consisting of a series of such repeat units is used as a probe, it will therefore
hybridise to many different loci. While this method can thus detect many different highly variable
loci, it is usually not possible to determine which alleles belong to which loci and therefore, to
determine the allele frequencies for each locus (Burke et al. 1991). In order to circumvent this
problem, it is possible to isolate single locus minisatellite DNA probes, which have in addition to the
minisatellite repeat sequence, a unique sequence flanking the minisatellite region that occurs only
once in the genome, and thereby prevents the "probe" from hybridising with more than one
minisatellite locus.
Screening of minisatellite DNA variation with single-locus probes involves digesting genomic
DNA with a restriction enzyme (e.g. HaeIII), separating the fragments by agarose electrophoresis,
transfer of the DNA to a nylon membrane by Southern blotting and hybridisation of the membrane-bound
DNA to the single locus probe (where one of the nucleotides has been labelled with 32P)
under stringent hybridisation conditions (Taggart and Ferguson 1990b). Allelic variation can then be
revealed by autoradiography. Different alleles can occur due to differences in the number of repeat
units that make up the allele, or due to mutations at the restriction sites.
An alternative means of screening for minisatellite variation involves designing PCR
(polymerase chain reaction) primers complementary to the flanking sequences (Jeffreys et al. 1988b),
enabling amplification of that minisatellite region in any individual. The amplified product will then
contain two alleles, whose length will depend on the number of minisatellite repeat units between the
primers. Since the size of the flanking region can be determined from the sequence and the size of
each repeat unit can be similarly determined, it is possible by separating the fragments by
electrophoresis to determine the number of repeat units in each allele. This enables a large number of
alleles to be resolved (n > 30) even on an agarose gel.
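The arithmetic behind this sizing step can be sketched as follows. The locus dimensions (120 bp of flanking sequence including the primers, a 16 bp repeat unit) and the band sizes are assumptions chosen for illustration, and repeat_count is a hypothetical helper name.

```python
def repeat_count(fragment_bp, flank_bp, repeat_unit_bp):
    """Estimate the number of repeat units in a PCR-amplified allele.

    fragment_bp    -- size of the band read from the gel
    flank_bp       -- combined length of both flanking regions (primers included)
    repeat_unit_bp -- length of one repeat unit
    """
    repeat_bp = fragment_bp - flank_bp
    if repeat_bp < 0:
        raise ValueError("fragment is shorter than the flanking sequence")
    # Band sizes read from a gel are approximate, so round to the nearest unit.
    return round(repeat_bp / repeat_unit_bp)


# Hypothetical heterozygote with bands at 600 bp and 920 bp.
genotype = [repeat_count(size, 120, 16) for size in (600, 920)]
```

Rounding to the nearest whole repeat is a simplification; with small repeat units or large fragments the sizing error of an agarose gel may exceed one repeat unit.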
While the minisatellite core sequences (sequences consisting of the repeat units only) are
believed to be non-coding, there is usually little information on the function of the flanking
sequences. It is therefore possible to have a locus under the influence of selection in the flanking
sequence, tightly linked to a minisatellite locus, resulting in "hitchhiking selection". In such a
situation, the minisatellite locus may be positioned adjacent to a locus which is subject to selection.
Due to proximity, there is little opportunity for recombination, and so segregation at the two loci can
be linked. In this way, alleles from the neutral minisatellite locus may segregate together with the
alleles from the adjacent locus under selection. As the behaviour of the minisatellite is the same as if
it was directly under the influence of selection, this is termed hitchhiking selection. It is also possible
that minisatellite regions may be in some way involved in the control of transcription, which could
place these loci under the influence of selection. Krontiris et al. (1993) have described an association
between mutations at the HRAS1 minisatellite locus in humans and the risk of breast cancer which
may be an example of this phenomenon.
The aforementioned methods of mtDNA, multi-locus and single locus minisatellite DNA
analysis (excluding PCR-amplified minisatellites) and anonymous cDNA markers have in common
the need to first cut the DNA with a restriction enzyme, transfer it to a nylon membrane and then
probe the membrane by hybridising a labelled (usually radio-isotope) probe to the membrane-bound
DNA. There are a number of limitations imposed by this approach.
(1) High quality DNA is required to ensure that the cuts in the DNA are those of the restriction
enzyme and not due to degradation.
(2) The DNA needs to be sufficiently pure that the restriction enzyme is not inhibited.
(3) Several micrograms of DNA are required to ensure a strong signal in the X-ray film.
(4) Labelling of the DNA has traditionally involved the use of radio-active isotopes which are both
hazardous and expensive (although chemiluminescent labelling is now becoming more common).
(5) The membranes are very expensive.
(6) Band sharpness is lost due to the combined effects of Southern blotting and autoradiography.
6.3 Microsatellite DNA analysis
Microsatellite loci are similar to the minisatellite loci described above, except that the repeat units are
di-, tri- or tetra-nucleotides in length (Wright and Bentzen 1994). They are much more common in the
genome than minisatellite loci, enabling these loci to be isolated much more easily. Screening of
allelic variability at these loci using automated DNA sequencers is becoming the standard when the
resources are available (Tully et al. 1993). There is considerable debate in the literature at present
regarding the mode of mutation for microsatellite loci (i.e. infinite allele model versus stepwise
mutation model) (Valdes et al. 1993), the possibility of selection influencing allele / genotype
frequencies and the dangers associated with interpretation of data with a high incidence of null alleles
for some loci. As the number of studies using microsatellite loci increases, some of these issues
should be resolved in the near future.
Tri- and tetra-nucleotide loci are preferred due to their tendency to have a lower incidence of
stutter bands relative to di-nucleotide repeats (O'Reilly et al. 1996). This simplifies determination of
which bands should be used to define the genotype of a particular individual; this is especially
important when automated genotype characterisation is being undertaken by image analysis. By
combining two to four microsatellite loci with non-overlapping allele size ranges, it is possible to
multiplex different loci, and hence greatly increase the efficiency of genotype characterisation.
Therefore, while the technical aspects of microsatellite analysis have progressed enormously, the
understanding of the underlying assumptions for this class of marker requires further research. In the
interim, caution should be exercised when interpreting microsatellite data.
6.4 Transcribed Sequences
Transcribed sequences refer to the conventional definition of the gene, in which a segment of the
genome is known to code for a particular protein. There has been a renewed interest in these loci
in a number of respects.
(a) Some of the loci have been studied extensively by allozyme electrophoresis and analysis at the
DNA level can provide an improved understanding of the information collected from earlier studies.
(b) Many of the loci contain highly conserved regions which code for proteins (exons) and non-coding
regions (introns) which are not involved in coding. While the former enable the design of PCR
primers that can be effective for a range of species, the latter can be highly variable, and enable the
development of rapid methods of screening for genetic variability (e.g. RFLP analysis).
(c) While it would be inappropriate to assume that allelic variability detected at these loci should be
selectively neutral, the genetic diversity can be used to differentiate among populations, where it can
be shown that the strength of selection is weak (i.e. observed differences cannot be attributable to one
generation of selection), and allele frequencies are temporally stable.
7. Statistical analysis
7.1 "Standardised approach"
There are two primary considerations when choosing between the various software packages for
analysis of genetic data:
(1) Which evolutionary models are relevant to the data, and which of the packages base their
analysis on those models?
(2) Which packages are the most "user friendly"?
It is clear from many reports and publications produced in the past that the latter consideration
ranked highest. BIOSYS-1 (Swofford and Selander 1981) facilitated a relatively easy but
comprehensive analysis of data generated from allozymes, and it is evident from the literature that an
"acceptable standard analysis" of such data was present during the 1980s as follows:
(a) calculation of levels of polymorphism and allele frequencies for each locus,
(b) 2 test for conformance of each of the taxonomic units at each locus with Hardy-Weinberg
expected proportions,
(c) heterogeneity 2 analysis to test for significant differences between allele frequencies in different
taxonomic units,
(d) analysis of F-statistics (Wright 1978) to determine the proportion of genetic diversity within and
among sub-units (FST), to estimate the rate of gene flow (from FST), and to determine if instances of
non-conformance with HW expectations were due to excesses or deficiencies of heterozygotes (F IS),
(e) calculation of Nei's 1972 or 1978 genetic distances, and
(f) generation of a UPGMA dendrogram from the genetic distances.
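Steps (a), (b) and part of (d) of this list can be illustrated for a single locus with two alleles. The following is a minimal sketch, not the procedure of any particular package: the genotype counts are invented, hw_summary is a hypothetical function name, and FIS is computed with the simple estimator 1 - Ho/He without any small-sample correction. The locus must be polymorphic, otherwise the expected counts are zero.

```python
def hw_summary(n_aa, n_ab, n_bb):
    """Allele frequency, Hardy-Weinberg chi-square and FIS for one
    biallelic locus, from the observed genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)        # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    h_obs = n_ab / n                       # observed heterozygosity
    h_exp = 2 * p * q                      # expected heterozygosity
    f_is = 1 - h_obs / h_exp               # > 0: heterozygote deficiency
    return p, chi2, f_is


# Hypothetical sample of 100 individuals: 30 AA, 40 AB, 30 BB.
p, chi2, f_is = hw_summary(30, 40, 30)
```

With two alleles the test has one degree of freedom, so the chi-square value is compared with 3.84 at the 5% level. A positive FIS indicates that the deviation is due to a deficiency of heterozygotes.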
While this approach does address many aspects relating to genetic variability in populations,
its acceptance as a "standard" analysis, which could be applied to all situations, and the related
presumption that the use of this kind of analysis enabled results from different studies to be easily
compared, was, to say the least, an oversimplification of the issues involved. It is important at this point
to outline the issues that need to be considered.
7.2 Statistical problems
With the development of DNA techniques, and particularly the isolation of mini- and microsatellite
DNA loci, many difficulties that had been noted with analysis of allozyme data, became obstacles to
analysis of the data resulting from studies involving these highly variable loci. These can be
described as follows.
(a) Low frequency alleles, and particularly genotypes became more common as the number of
detectable alleles increased. This resulted in an increased risk of sampling error and hence, difficulties
associated with implementation of the various tests, due to the inappropriateness of the conventional
contingency 2 test. Also, contrary to suggestions in some of the early literature on mtDNA analysis ,
the increased resolving power of the DNA methods increases the potential for sampling error and thus
larger sample sizes are required (Moritz et al. 1987).
(b) Although the mutation rate of allozyme loci is generally regarded as being low (e.g. 10⁻⁶),
the rate of mutation of DNA loci can vary considerably (i.e. from 10⁻¹ for some minisatellite loci to
<10⁻⁶ for some coding sequences). Also, the mode of mutation varies among different DNA loci, and
to date the mechanism by which mutations arise in VNTR loci is not clearly defined (Henke et al.
1993) and thus it is impossible to determine which model of evolution can be applied to the
interpretations of the data.
(c) While FST analysis can be used to determine hierarchical structuring of genetic diversity
within a species, in most cases the statistical significance of this structuring is left untested. The
importance of such structuring is generally accepted as critical to management and conservation
considerations, and thus it would seem imperative that appropriate testing of this aspect should be
undertaken.
(d) Where the object of the study is to determine the genetic relationships between different
taxonomic units, or to describe population structure, great caution is required. Many studies, due to
limited resources or availability of polymorphic loci are restricted in this respect. In studies where
only a small number of loci are screened, there is a high risk of any single locus being under the
influence of either balancing or directional selection (i.e. non neutral), and biasing the conclusions as
a consequence (Nei 1987). What is therefore required is that as many loci as possible are screened
and only a consensus-type dendrogram should be used for interpretation, where the data has been
subjected to numerical resampling to highlight unreliable associations of taxonomic groups. The
genetic distance calculated and the algorithm used to generate the dendrogram are also of great
importance, as these reflect the assumptions that are being made regarding the evolutionary processes
that have resulted in the phylogeny at the time of analysis (Felsenstein 1985).
(e) The basic aims of the study need to be defined at an early stage, in order to determine the
most appropriate analyses to undertake. Discrimination between taxonomic units can be carried out
using loci which are under the influence of selection, provided that the allele frequencies are stable
over time. Separation of taxonomic units can be defined by significant results from a heterogeneity χ²
analysis, or principal component analysis. Studies aiming to describe population structure need to be
based on selectively neutral loci, where there is at least some impression of the mutation rate: high
mutation rates result in homoplasies and thus tend to cause differentiation to be underestimated.
It is beyond the scope of this discussion to attempt to review the many software packages
available. With respect to their "user friendliness", such a discussion might invariably be biased by
the experience of this author. Different analysts will have different backgrounds and for example,
while those familiar with MS DOS may have little difficulty negotiating around MS DOS based
programs, others whose experience has been largely focused on the MS Windows / Apple
Macintosh environment would prefer to avail of software with a Graphical User Interface (GUI).
Therefore, it is not the actual software package that requires consideration, but, as with any general
statistical packages, it is the particular tests implemented that are important. While the composition
of some software packages tends to reflect the authors' opinions on which components are most
appropriate, by inclusion of only those components, many of the recently developed packages have
included a whole range of possible options for undertaking each aspect of the analysis. This again
puts the onus on the user to choose the most appropriate analysis, in which the following should be
considered.
7.3 Recommended approaches for analysis of genetic diversity
i) Number of loci tested and the level of polymorphism. Even if some of the loci are later excluded
from the analysis due to lack of variability or technical difficulties, it is important for studies
describing the relationships among populations to include information on all of the loci initially
tested. Selection of loci (e.g. based on their levels of polymorphism) can sometimes be biased by
choosing the loci that show the greatest differences, which may be influenced by factors such as
directional or balancing selection.
ii) Allele and genotype frequencies: Allele frequencies are the normal means of citing the "raw data".
However, this is only of value when there is no deviation from Hardy-Weinberg expected proportions.
When the genotypic frequencies do not conform with HW equilibrium, then allele frequencies are
inadequate to describe the raw data, since the genotypic frequencies do not have a normal distribution
and cannot therefore be approximated based on HW expectations (Weir 1990a).
iii) Homozygosity and heterozygosity: These have taken on added importance with VNTR loci due to
problems with "null alleles". Reporting of excesses or deficits of homozygotes / heterozygotes should
be accompanied by data detailing how many (if any) of the samples were not scored for a particular
locus. Failure to get data from an individual sample may reflect the presence of a "null homozygote".
iv) Conformance with Hardy-Weinberg expected proportions: As stated earlier, the conventional χ²
test is not appropriate for most VNTR data, due to the problems with very small expected numbers in
many of the categories. The preferred alternative is based on Fisher's exact test (Guo and Thompson
1992). Computational difficulties associated with the implementation of the exact test with a large
number of alleles have been overcome using a pseudo-random method of testing subsets of the data as
described by Guo and Thompson (1992). HW assumes that genes are distributed independently into
genotypes. Therefore, this test for HW conformance involves simulations that assume independent
distribution of genes into genotypes, using the same allele frequencies as in the original sample. By
repeating the process several times (permutations), it is possible to build up a distribution of the
genotype frequencies that would be expected based on HW expectations. The original genotypic
frequencies can then be compared with the permutation results to determine the probability of the
observed frequencies conforming with HW expected proportions. This overcomes the difficulty
associated with small expected numbers for individual classes.
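The permutation idea described under (iv) can be sketched in a few lines. This is a simplified illustration and not the Guo and Thompson (1992) algorithm itself: it shuffles the sampled genes, re-pairs them into genotypes, and uses the number of homozygotes as the test statistic, which gives a one-sided test for heterozygote deficiency (the pattern expected from null alleles). The function name, the default number of permutations and the example data are all assumptions.

```python
import random


def hw_permutation_test(genotypes, n_perm=2000, seed=0):
    """One-sided Monte Carlo test for heterozygote deficiency.

    genotypes -- list of (allele, allele) tuples, one per individual
    Returns the proportion of permuted data sets with at least as many
    homozygotes as observed (with the usual +1 correction).
    """
    rng = random.Random(seed)
    alleles = [a for g in genotypes for a in g]   # pool all sampled genes
    observed = sum(a == b for a, b in genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)                      # random union of genes
        homozygotes = sum(alleles[i] == alleles[i + 1]
                          for i in range(0, len(alleles), 2))
        if homozygotes >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical VNTR locus with 10 alleles; every individual is homozygous.
all_homozygous = [(i, i) for i in range(10)] * 2
p_deficit = hw_permutation_test(all_homozygous)

# Hypothetical sample in which every individual is heterozygous.
all_heterozygous = [(i, i + 1) for i in range(20)]
p_no_deficit = hw_permutation_test(all_heterozygous)
```

Because the null distribution is built from the data, no expected class needs a minimum count, which is the property that makes this family of tests suitable for loci with many rare alleles.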
7.4 Recommended approaches for analysis of genetic differentiation
i) Heterogeneity test: The conventional heterogeneity χ² test of allele frequencies is also of limited
value with respect to VNTR loci, due to difficulties with large numbers of alleles with low expected
numbers in many of the classes. This problem can be overcome by either permutation tests or
numerical resampling techniques. The null hypothesis is that the samples are simply random subsets
of a larger population. The permutation tests therefore determine if the allele frequencies reflect
alleles distributed independently into two samples. This is done by pooling the data, and randomly
drawing samples of the same sizes as the original samples from the pooled data set.
Permutations of this procedure generate a distribution of the allele frequencies expected based on this
random model. Comparison of the observed data with this distribution (i.e. is it within the 95%
confidence intervals) can be used to test the null hypothesis. Numerical resampling such as
bootstrapping or jackknifing operates on a slightly different basis (Weir 1990b). [Bootstrapping
involves randomly resampling subsets of the original data to provide new samples of equivalent size
as the original (sampling with replacement); jackknifing eliminates one observation of the data (e.g.
individual / locus) at a time, so that the number of new data sets is equal to the number of
observations]. By calculating the variance within each data set (95% CIs from bootstrapping; SDs
from jackknifing), it is possible to determine whether the estimates for different populations differ significantly (e.g.
their 95% CIs do not overlap).
ii) Population structuring: There is an ever increasing range of algorithms for calculating F-statistics.
Wright (1978) defined the differentiation between subpopulations relative to the total as a parameter
called FST. Analogous parameters have now been defined by several authors based on alternative
assumptions about the evolutionary model and consequent modifications to the algorithm. The two
most commonly used algorithms until recently were those of θ and GST; however, while the latter has
the advantage of not showing negative values when sub-populations appear similar, it is also biased in
those circumstances and results in an overestimate of the degree of sub-structuring. Over the last two
years, many of the newer software packages have included calculations of RST, which assumes that
microsatellite evolution is primarily driven by the stepwise mutation model. Because repeat units do not
usually vary across a repeat region, the mutation process of microsatellites is difficult to test
directly, and this assumption appears to be flawed. While early models
assumed that mutations at microsatellite loci took the form of the gain or loss of a single repeat unit
(conforming to a strict stepwise mutation model), available empirical evidence has demonstrated that
this is certainly not the case (some of the latest models are therefore not based on a strict one-step
mutation model). In fact, while recent studies by Jeffreys and co-workers have shown that a
completely different mutation process (gene conversion type) dominates in minisatellite loci, there is
no evidence to suggest that the same process is not also dominant in microsatellite loci. Therefore,
any models which assume that there is a relationship between the number of repeat units and the
divergence time of any two VNTR alleles, need to be interpreted with great caution. Such analysis
should only be conducted alongside analysis based on alternative models, where any differences
between the results can be noted.
iii) Genetic distance: Until recently much of the literature concerning population studies was based on
Nei's 1972 and 1978 genetic distance measures. For situations where the population(s) have been
stable over a long time, these are appropriate measures. However, in the case of many aquatic
species, the extent of founder effects and genetic drift can be considerable, and therefore the genetic
distance measures used should be appropriate to this. An alternative measure which is more
appropriate to situations where genetic drift is an important factor is the coancestry coefficient of
Reynolds et al. (1983). This is based on FST values, where the relationship between genetic distance and
time is approximately linear in instances of recent divergence. The major difference between the
genetic distances of Nei (including the more recent DA measure) and the FST based measures is that
the former weights intermediate frequency differences highly and differences between extreme
frequencies much less (i.e. a much greater genetic distance would appear from differences between
0.55 and 0.45 than from 0.00 and 0.10). While the same occurs in the case of the FST based measures,
the "problem" appears to be much less. Therefore, further research is required to determine the
"optimal" genetic distance measure.
It should also be noted that where the genetic distances measured are very low (e.g. < 0.005),
there is likely to be a large error associated with the distance (unless it has been derived from a very
large number of loci).
iv) Dendrograms: UPGMA dendrograms have tended to predominate in past literature. While this
method of phylogeny reconstruction has some favourable attributes, its underlying assumptions are
rarely met. There are several "additive" type dendrograms that can be used which should be
preferred. An example of one of these is the Neighbour-Joining method of Saitou and Nei (1987).
Whatever algorithm is used for generating the dendrogram, bootstrapping of the allele frequencies
(followed by calculation of genetic distances, etc.) should be used to build a consensus dendrogram to
enable the reliability of the nodes to be assessed. This involves bootstrapping over loci, which
assumes that each of the loci represents an independent data source (an independent "sample" from the
genome). As with all measures of genetic relatedness, generation of a dendrogram relies on the data
that has been collected. Therefore, it is essential to have as many loci as possible included in the
study, to enable the dendrogram to be interpreted as an "organism tree" rather than a "gene tree" (in
the case of single locus data) (Nei 1987).
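Two points from this section, the weighting of intermediate allele frequencies by Nei's distance and bootstrapping over loci, can be illustrated with a short sketch. It implements Nei's (1972) standard distance D = -ln(Jxy / sqrt(Jx Jy)) and a percentile bootstrap over loci; the function names, the number of bootstrap replicates and the example frequencies are assumptions, and building the consensus dendrogram itself is not shown. Population pairs that share no alleles (Jxy = 0) are not handled.

```python
import math
import random


def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance.

    pop_x, pop_y -- lists of loci; each locus is a list of allele
    frequencies given in the same allele order in both populations.
    """
    jx = sum(p * p for locus in pop_x for p in locus)
    jy = sum(q * q for locus in pop_y for q in locus)
    jxy = sum(p * q for lx, ly in zip(pop_x, pop_y) for p, q in zip(lx, ly))
    return -math.log(jxy / math.sqrt(jx * jy))


def bootstrap_over_loci(pop_x, pop_y, n_boot=1000, seed=0):
    """Percentile 95% interval for the distance, resampling loci
    with replacement (each locus treated as an independent sample)."""
    rng = random.Random(seed)
    n_loci = len(pop_x)
    distances = []
    for _ in range(n_boot):
        idx = [rng.randrange(n_loci) for _ in range(n_loci)]
        distances.append(nei_distance([pop_x[i] for i in idx],
                                      [pop_y[i] for i in idx]))
    distances.sort()
    return distances[int(0.025 * n_boot)], distances[int(0.975 * n_boot) - 1]


# The single-locus comparison from the text: the same frequency difference
# of 0.10, at intermediate and at extreme allele frequencies.
d_intermediate = nei_distance([[0.55, 0.45]], [[0.45, 0.55]])
d_extreme = nei_distance([[0.00, 1.00]], [[0.10, 0.90]])

# Hypothetical five-locus data set in which all loci carry the same signal.
low, high = bootstrap_over_loci([[0.55, 0.45]] * 5, [[0.45, 0.55]] * 5)
```

The intermediate-frequency comparison gives a distance of about 0.020 and the extreme-frequency comparison about 0.006, which is the weighting effect described under (iii). With real data the loci differ, and the width of the bootstrap interval shows how strongly the estimate depends on which loci happened to be screened.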
9. Sampling considerations
DNA fragments tend to be broken down rapidly by endonucleases once the cell dies. While certain
techniques that are based on small fragments of DNA (e.g. microsatellite analysis) are tolerant of high
levels of degradation, and have enabled the use of ancient DNA in some applications (Paabo 1989), a
good sampling programme should aim to ensure that the quality of DNA available for analysis is as
high as possible. This is achieved by ensuring that the endonucleases that cause the degradation are
prevented from becoming active. One option is to freeze the tissue immediately post-mortem, and
avoid any defrosting of the tissue before the DNA extraction. Alternatively, some of the tissue can be
placed in 99% ethanol (in at least three times the volume of the tissue), and given a quick shake to
ensure that the tissue is well immersed. Other alternatives may be available depending on the species
of interest and the tissue type: in the case of blood and small insects, smearing of the tissues (or
blood) on a glass slide and air drying (especially appropriate for warm countries where freezing is
impractical and ethanol is difficult to acquire) is known to yield good quality DNA. As tissues for
different species can differ substantially, a pilot study is required prior to any large sampling
programme, to ensure that the proposed method of storage will provide DNA of sufficient quality to
apply the relevant analysis technique.
Two criteria are important for DNA analysis: the quality of the DNA (the degree to which it is
degraded) and its purity. While DNA of maximum quality (i.e. DNA that has
not been broken down into small fragments) is always desirable, it is not always possible to obtain
DNA which has not already been degraded by endonucleases and for some applications which involve
analysis of relatively small DNA fragments, it is not essential. For other applications, particularly
those involving restriction analysis of non-amplified DNA, high quality DNA is a prerequisite.
Therefore, the final application should be considered when choosing the type of method for DNA
extraction. The second variable to consider is the purity of the DNA. While phenol-chloroform based
extraction protocols are designed to remove impurities from the DNA, other approaches simply
release the DNA into solution. The latter can be adequate for many PCR based applications, while
some impurities can inhibit restriction enzymes, which affects applications that involve digestion of
the total DNA. Therefore, as with DNA quality, the time and resources invested in DNA purification
should match the application for which the DNA is required. DNA quality depends on the
condition of the DNA when it is received; purity does not, so it should always be possible to obtain
pure DNA if required.
One of the most important considerations when undertaking a population study is to determine
how many individuals need to be sampled. However, for any given study, there will be limiting
resources (e.g. labour, consumables, etc.). Therefore, it is necessary to balance the need to collect as
large a sample size as possible against the need to screen as many populations as possible, and the
need to get allele frequencies from as many loci as possible. Whether the aim is to describe genetic
variation in populations or species, it will be necessary to get an adequate picture of the variation
within the taxonomic unit (population or species), in order to be able to determine the degree of
differentiation among taxonomic units. Similarly, the choice of the number of loci is a form of
genetic sampling: one or a few loci may provide only a gene phylogeny (which may not necessarily be
representative of the whole genome), whereas what is ideally required is an organismal phylogeny
based on as many loci as possible (Weir 1990b).
These considerations mean that there is no simple recommendation for the "ideal" sample size,
number of samples, or number of loci. Each situation will be case specific, so that it is necessary to
weigh up the above three constraints taking the type of marker into consideration. Some taxonomic
groups show considerable genetic variation among locations. Therefore, in order to be able to
determine genetic differentiation between two populations from different areas, for example, it would
be necessary to get a reliable estimate of within-population genetic diversity.
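As a rough guide to the sample-size trade-off discussed above, the sampling error of an allele frequency estimate can be calculated for different numbers of individuals. The sketch below assumes a diploid, codominant marker, random sampling of individuals and the usual binomial approximation SE = sqrt(p(1-p)/2N); the function name and the example values are illustrative only.

```python
import math

def allele_freq_se(p, n_individuals):
    """Binomial standard error of an allele frequency p estimated from
    n_individuals diploid individuals (i.e. 2N sampled gene copies)."""
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))

# Compare a few candidate sample sizes for an allele at frequency 0.5.
for n in (10, 25, 50, 100):
    print(n, round(allele_freq_se(0.5, n), 3))
```

Because the error decreases with the square root of the sample size, quadrupling the number of individuals only halves the standard error, which is why the resources are often better spent on additional loci or populations once a moderate sample size has been reached.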
8. Protocols
8.1 DNA Extraction by Phenol-Chloroform method
1. Prepare PK Buffer (proteinase K buffer) as follows:
   EDTA (0.5M)                    10.0ml
   SDS (10%)                       2.5ml
   Tris (10mM) pH 8.0 with HCl     0.5ml
   H2O                            37.0ml
2. Label one tube for each extraction and add 400µL PK buffer and 4µL proteinase K (20mg/ml)
to each tube.
3. Add approximately 60mg tissue to each tube.
4. Incubate at 50°C for at least 3 hours (or overnight).
5. Add 10µL RNase (10mg/ml), mix and incubate for 90 min at 37°C.
6. Add 400µL phenol (hydrolysed) to each tube, mix by vortexing. [CAUTION! Both phenol
and chloroform are highly corrosive and volatile. It is essential to wear protective glasses,
gloves and lab-coat when handling these substances. All work should be carried out in a fume
cupboard.]
7. Add 400µL chloroform-isoamyl alcohol (24:1) to each tube, mix by vortexing.
8. Centrifuge at 13,000g for 10 minutes. Label a new tube for each sample.
9. Remove 300µL from the aqueous phase (top layer) and transfer to the new tube.
10. Add 600µL 99% EtOH, and shake the tube to precipitate the DNA. Remove the supernatant
and allow the pellet to dry out for 5-10 min.
11. Re-suspend the DNA pellets in 50-100µL TE buffer (Tris-EDTA).
12. Test the DNA quality by running out a 1µL aliquot (mixed with 5µL 1x loading dye) on a 1%
agarose gel in 1X TBE alongside a size marker. High molecular weight DNA appears as a sharp
band close to the origin, while degraded DNA appears as a smear along the lane.
[Recipe for 2 litres 5X TBE Buffer:
108g Trizma Base (IRRITANT)(Sigma Chemical Company)
55g Boric Acid (X)(Sigma Chemical Company)
40ml 0.5M EDTA (pH8.0) (X)(Sigma Chemical Company)
Add distilled water up to a volume of 2 litres.]
8.2 PCR Amplification of a Minisatellite DNA locus
(1) Label sterile 0.5ml tubes.
(2) Defrost H2O, x10 Buffer, MgCl2, dNTPs and both primers on ice.
(3) Add 100ng (1µL) of DNA template (e.g. cod) to the base of each tube.
(4) Create a master mix from the rest of the solutions, adding 10% extra
of each to allow for pipetting error:
    Component                     Volume per tube   Final concentration
    X10 Reaction Buffer           2.0µL             x1
    dNTP (1.25mM stock)           4.0µL             0.25mM
    MgCl2 (25mM stock)            1.6µL             2.0mM
    Primer 1 (20µM stock)         0.5µL             0.5µM
    Primer 2 (20µM stock)         0.5µL             0.5µM
    H2O                           11.3µL
    Thermoprime plus polymerase   0.1µL             1U
(5) Dispense the master mix from (4) into each tube and overlay with 30µL mineral oil.
(9) Seal the tube and start the main PCR programme cycle:
    [95°C for 2 min] x 1 cycle
    [94°C for 1 min; 60°C for 1 min; 72°C for 1 min] x 30 cycles
(10) Following PCR amplification, the quality of the products should be assessed on a 0.5% TBE gel by
running out 5µL of the reaction (with 1µL 6x loading dye). If one or two sharp bands appear (they may
be faint), proceed.
(11) Separate products by loading 12µL on a large (20x30cm) agarose gel apparatus, to enable alleles
to be clearly separated, even when they are only differentiated by a single repeat unit.
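The master-mix arithmetic in step (4) can be sketched as follows. The per-tube volumes, the stock concentrations and the 10% excess are those given in step (4); the function names, the dictionary layout and the 20µL reaction volume (the sum of the master-mix components, excluding the 1µL of template) are assumptions made for illustration. The second function applies C1V1 = C2V2 and can serve as a quick check on tabulated final concentrations.

```python
# Per-tube volumes in microlitres, as listed in step (4).
PER_TUBE_UL = {
    "10x reaction buffer": 2.0,
    "dNTP (1.25 mM stock)": 4.0,
    "MgCl2 (25 mM stock)": 1.6,
    "primer 1 (20 uM stock)": 0.5,
    "primer 2 (20 uM stock)": 0.5,
    "H2O": 11.3,
    "polymerase": 0.1,
}

def master_mix(n_tubes, excess=0.10):
    """Volume of each component (microlitres) for n_tubes reactions,
    including a fractional excess to allow for pipetting error."""
    factor = n_tubes * (1.0 + excess)
    return {name: round(vol * factor, 2) for name, vol in PER_TUBE_UL.items()}

def final_concentration(stock, volume_ul, total_ul=20.0):
    """C2 = C1 * V1 / V2, in the same unit as the stock concentration.
    total_ul = 20.0 is an assumption (master mix only, template excluded)."""
    return stock * volume_ul / total_ul

# Example: master mix for 10 reactions.
for name, vol in master_mix(10).items():
    print(f"{name:<24}{vol:>8.2f}")
```

Calling final_concentration(1.25, 4.0) for the dNTPs, for instance, reproduces the 0.25mM listed in the protocol.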
8.3 Preparation of an Agarose gel
1. Determine the volume required. Measure the internal dimensions of the gel casting tray and
calculate the volume required to make a gel of 5mm in thickness (e.g. a gel of 200 x 300mm
requires 20 x 30 x 0.5cm = 300ml of agarose solution).
2. Determine the gel concentration required. For larger fragments (> 1kb), use less agarose (i.e.
0.7-1.0%) to enable these fragments to pass through the matrix during electrophoresis and be
separated. In the case of smaller fragments, higher concentrations are necessary (i.e. 1-3%), to
prevent the bands from becoming diffuse. A 1% concentration provides reasonable results over a
wide range of fragment sizes.
3. For a 300ml gel, add 3.0g agarose to 300ml 1x TBE buffer in a conical flask, and heat to boiling
point in a microwave oven. [CAUTION! When removing the flask from the microwave, the
melted agarose can sometimes contain trapped air, causing it to overflow once disturbed]. Add
15µL of ethidium bromide (i.e. 5µL/100ml of a 20mg/ml stock), and swirl the flask gently to mix.
[CAUTION! Ethidium bromide is highly mutagenic; any equipment which comes into
contact with the gels should not be handled without gloves]. Cover the top of the flask with foil
to prevent evaporation.
4. Seal the ends of the gel casting tray with masking tape and place on a level surface. Insert the
combs in the appropriate slots on the casting tray to form the wells for applying the samples.
5. Cool the melted agarose to approximately 50°C prior to pouring (for approx. 40 minutes with
occasional swirling), to avoid warping the casting tray. Pour the gel, removing any air bubbles
with a Pasteur pipette. Allow the gel to set.
6. When the gel has set, gently remove the combs and place the tray in the electrophoresis tank.
Add 1x TBE buffer until the gel has been submerged by approximately 5mm. The gel is then
ready for sample application to the wells.
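The calculations in steps 1 and 3 can be sketched as a small helper. The function name and its defaults are illustrative assumptions; the 5mm thickness, the 1% (w/v) concentration and the 5µL of ethidium bromide stock per 100ml are taken from the steps above.

```python
def agarose_gel_recipe(length_mm, width_mm, thickness_mm=5.0,
                       percent=1.0, etbr_ul_per_100ml=5.0):
    """Return (gel volume in ml, agarose in g, ethidium bromide stock in microlitres)."""
    # Convert mm to cm; 1 cubic centimetre equals 1 ml.
    volume_ml = (length_mm / 10) * (width_mm / 10) * (thickness_mm / 10)
    # Percent w/v means grams per 100 ml.
    agarose_g = volume_ml * percent / 100
    etbr_ul = volume_ml * etbr_ul_per_100ml / 100
    return volume_ml, agarose_g, etbr_ul

# The 200 x 300mm tray used as the example in step 1.
volume, agarose, etbr = agarose_gel_recipe(300, 200)
print(f"{volume:.0f} ml 1x TBE, {agarose:.1f} g agarose, {etbr:.0f} ul ethidium bromide")
```

For a different tray or gel concentration, only the arguments change, e.g. agarose_gel_recipe(150, 100, percent=2.0) for a small 2% gel.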
8.4 Preparation of an acrylamide gel for the Li-Cor Automatic
DNA Sequencer.
a) Preparation of gel plates
1. Clean the two gel plates very carefully both with distilled water and ethanol to remove all dust
particles that might cause air bubbles.
2. Insert 0.25mm spacers, on each side between the two plates and tie them with the clamps (leave
the top clamps open until the gel has been poured and the top spacer has been inserted).
3. Position the gel plates at a shallow angle (approximately 30°) to make the gel easier to pour.
b) Recipe for 33cm gel (increase volumes by 50% for a 41cm sequencing gel)
1. Weigh out 10.5g Ultrapure Urea (IRRITANT)(Amersham Life Science).
2. Add: 10ml distilled water
3. 5ml 5X TBE Buffer
4. 2.8ml RapidGel(TM)-XL-40% Concentrate (TOXIC)(Amersham Life Science).
5. Place parafilm over the mouth of the container and mix thoroughly until all urea has dissolved.
6. Add 25µl TEMED (CORROSIVE)(Sigma Chemical Company).
7. Add 175µl of Ammonium Persulphate solution (1mg/10µl) (CORROSIVE)(Sigma Chemical
Company).
8. Mix thoroughly. Suck the mixture (approx. 20ml of solution) into a pipette.
9. Pour the solution into the two plates starting from one side towards the other and then back to the
middle keeping the same pressure.
10. Check for air bubbles and remove any that are present.
11. Apply the top spacer (to leave the space for the comb) and insert the casting plate, tie the top
clamps and leave the gel to polymerise for at least 1 hour (not more than 2-3 hours because it will
dry out).
12. Immediately wash the container and the pipette to prevent acrylamide from polymerising in them.
c) Prepare the gel to be loaded
1. When the gel has solidified, remove the top spacer and clean the excess polyacrylamide
2. Place the gel plates inside the DNA sequencer, insert the buffer chamber and tighten the clamps
(hand tight only)
3. Pour 500ml of 1X TBE Buffer into each buffer tray. Clear the loading edge of excess
polyacrylamide by flushing it with a Pasteur pipette.
4. Place the lids on each buffer tray and connect the circuit from the gel plates to the automatic
sequencer. Close the cover.
5. Create a new file for the data on the hard drive of the automatic sequencer as follows. a) open the
program (data collection); b) create a new directory (file, new, create), c) open / edit the
configuration file as appropriate for the gel in use, d) turn on the scanner, e) click ENTER and
check that the circuit is closed.
6. Pre-run the gel for 15 - 20 min. to ensure that it is adequately heated and prepared before loading
the samples.
d) Preparation of the samples
1. Pipette 2µl of Loading Dye - Formamide ACS Reagent (TOXIC)(Sigma Chemical Company)
solution into 0.5ml sterilised microcentrifuge tubes.
2. Add 1µl of PCR product to each and mix by pipetting a few times.
3. Heat each tube to 85°C for 60 seconds on a PCR thermal cycler or in a water bath to denature the
DNA (to single stranded DNA).
e) Loading the gel
1. Turn the machine off and open the cover.
2. Remove the lid from the top buffer tray.
3. Gently insert a 48- or 64-well comb (depending on the number of samples) between the two plates
until the tips of the comb are approx. 1mm into the gel.
4. Load 0.5µl of PCR product-Formamide solution into each well (i.e. between the teeth of the
comb). Also load 0.5µl of a standard DNA size ladder every 10-20 wells to measure the
molecular weight of the DNA fragments on the gel.
5. Replace the lid and close the cover.
6. Set the auto gain (options, auto gain, auto).
7. Focus the gel (scanner control, options, focusing, auto) (check that the curve is approx. like a
normal distribution).
8. Set auto gain again.
9. Turn on voltage and laser. The auto gain and the focusing are set after the samples are loaded in
case the gel plates get disturbed while loading.
8.5 Cycle Sequencing (Amersham kit)
1. Label 0.5ml microcentrifuge tubes:
(a) one tube per plasmid-primer combination for the master mix solutions detailed in (2)
below
(b) four tubes per plasmid-primer combination for the ddG, ddA, ddT and ddC nucleotide mixes
(i.e. for (a) label tubes 1-10, and for (b) label tubes 1G, 1A, 1T, 1C, 2G, 2A, etc.)
2. (a) Add 1µg of plasmid DNA to each tube from step 1(a) (i.e. tubes 1-10) and add H2O to bring
the total volume to 20µL
(b) Then add:
2.0µL primer (2µM),
3.0µL DMSO,
0.5µL 5M Betaine
3. Add 2µL from each of the ddG, ddA, ddT and ddC mixes to the appropriate tubes labelled in step 1(b).
4. Dispense 6µL of master mix from step 2 to each tube from step 3.
5. Seal the tubes, place on the PCR block and run the following program:
(95°C for 2.5 min) x 1 cycle
(95°C for 20 sec; 56°C for 20 sec; 72°C for 20 sec) x 30 cycles
6. Pre-run the sequencing gel to ensure that it has reached 50°C before applying the sequencing
reactions.
7. On completion of the program, add 4µL formamide loading dye to each tube and heat to 85°C for
1 min (i.e. on a PCR block).
8. Load 0.5-1.0µL (depending on comb size) from each tube on to the gel in the LiCor 4200
automated DNA sequencer.
9. (a) Run "Auto Gain" on the computer to calibrate the dynamic range of the fluorescence detection
(b) Run "Focus" to ensure that the laser is focussed on the centre of the gel
(c) Re-run Auto Gain to refine calibration based on the correct focus
10. Start the electrophoresis by switching the laser and voltage on.
Appendix 1. Evaluation of the relative merits of the main techniques, based on criteria
related to the implementation of these techniques (NB! This page to be read in landscape format)

Gene Type             Analysis Technique       Tissue Preservation  Minimum Tissue Quantity  DNA Quality     Cost
Isozymes              Starch Gel               -80°C                Medium                   Not applicable  Low
MtDNA                 CsCl + RFLP              Fresh liver only     High                     Very High       High
                      RE + Probing of filter   -20°C / 95% EtOH     Medium                   High            High
                      PCR + RFLP               -20°C / 95% EtOH     Low                      Medium          Low
                      PCR + Direct Sequencing  -20°C / 95% EtOH     Low                      Medium          High
Minisatellites        RE + Probing of filter   -20°C / 95% EtOH     Medium                   High            Medium
                      PCR amplified            -20°C / 95% EtOH     Low                      Low - Medium    Low - Medium
Microsatellites       PCR amplified            -20°C / 95% EtOH     Low                      Low             Low - Medium
Anonymous cDNA        RE + Probing of filter   -20°C / 95% EtOH     Medium                   High            Medium - High
Single Copy Sequence  PCR + RFLP               -20°C / 95% EtOH     Low                      Low - Medium    Low
                      PCR + Direct Sequencing  -20°C / 95% EtOH     Low                      Low - Medium    High
                      PCR + SSCP               -20°C / 95% EtOH     Low                      Low - Medium    Medium
Appendix 2. Practical evaluation of the main techniques for differentiating between species,
populations and individuals (NB! This page to be read in landscape format).

Gene Type             Analysis Technique       Phylogenetics at Species Level                                   Population Genet...
Isozymes              Starch Gel               Fair                                                             Good but lack of p... can be limiting in s...
MtDNA                 CsCl + RFLP              Good - but too many bands can be a problem; also very slow etc.  Fair (Very Slow)
                      RE + Probing of filter   Good - but too many bands can be a problem                       Fair (Very Slow)
                      PCR + RFLP               Very Good                                                        Good (if enough R... available)
                      PCR + Direct Sequencing  Excellent                                                        Very Slow but new... sequencing may he...
Minisatellites        RE + Probing of filter   Not suitable                                                     Very good
                      PCR amplified            Not suitable                                                     Very Good
Microsatellites       PCR amplified            Not suitable                                                     Very Good
Anonymous cDNA        RE + Probing of filter   Not suitable                                                     Good
Single Copy Sequence  PCR + RFLP               Good                                                             Good (if enough va... detected)
                      PCR + Direct Sequencing  Excellent                                                        Fair (too slow and...
                      PCR + SSCP               Not suitable                                                     Potentially Good f... populations
OoooooooooooooOOOOOOOOOOooooooooooooooo
PLANT DNA MARKERS
M. Heun, NLH-Ås
Electronic version not available.
Please refer to the handouts from Prof. Heun during the course.
ooooooo OOOOOO oooooooo