
What is the basis for the often-quoted statement that all humans are 99.9%
identical? Is this estimate correct? Why or why not?
The Collins crew (HGP):
Francis Collins placed this claim into context very well:
“The Human Genome Project has helped to inform us about how remarkably similar all
human beings are – 99.9% at the DNA level. Those who wish to draw precise racial
boundaries around certain groups will not be able to use science as a legitimate
justification. However, studying the 0.1% of human genetic variations, particularly the
distribution of single nucleotide polymorphisms, between affected and nonaffected
individuals will significantly inform biomedical researchers about the genetic
contributions to complex diseases such as cancer, diabetes, and mental illness.” (Collins
and Mansoura, 2001)




When he makes this claim, he references a study published in 1996, “Increasing the
information content of STS-based genome maps: identifying polymorphisms in
mapped STSs” (Kwok et al., 1996).
This study used sequence-tagged sites (STSs) and compared single nucleotide differences
between them to measure the degree of heterozygosity between any two
randomly chosen chromosomes.
It is not clear to me how the 99.9% figure is derived from this work, as the derivation is not
explicitly described in this publication.
Collins has taken a strong public stance on this issue and goes on to say: “Although
genetic variations do exist, they seldom segregate in a manner that conforms to the
racial boundaries constructed by sociopolitical means. The distribution of the 0.1%
of differences among us is revealing. Studies have proven that the vast majority of
these genetic variations are found within and not between populations, indicating that
these variations were present in our shared ancient human founder group” (Collins
and Mansoura, 2001).
The Venter crew (of Celera fame):
Craig Venter has often quoted the famous 99.9% figure: “The study of the genome
supports the fundamental unity of human beings. We all share at least 99.9% of the
nucleotide code in our genome. And yet it is remarkable that the diversity of human
beings at the genetic level is encoded by less than 0.1% variation in our DNA.”



Venter references his own human genome sequence publication (Science, 2001) as
well as a major publication by the SNP consortium.
This paper describes in detail an experiment in which the SNP consortium analyzed
4.5 million sequence reads, which generated a total of 1.2 billion
aligned bases and 920,752 heterozygous positions.
This work expressed sequence variation as a “normalized measure of
heterozygosity”, that is, the likelihood that a nucleotide position will be heterozygous
between two randomly chosen chromosomes.
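As a toy illustration of that definition only (not the consortium's actual pipeline), the sketch below averages the fraction of differing positions over every pair of aligned sequences. The function name `pairwise_heterozygosity` and the four short sequences are invented for the example.

```python
from itertools import combinations


def pairwise_heterozygosity(chromosomes):
    """Average fraction of aligned positions that differ between two
    sequences, taken over all pairs in the sample."""
    length = len(chromosomes[0])
    if any(len(c) != length for c in chromosomes):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(chromosomes, 2))
    # Count mismatching positions in every pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_differences / (len(pairs) * length)


# Hypothetical aligned sequences, far shorter than any real dataset.
toy_sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAA"]
toy_heterozygosity = pairwise_heterozygosity(toy_sample)
print(f"per-site heterozygosity: {toy_heterozygosity:.3f}")
```

Real data would also need alignment, quality filtering and normalization, which this sketch leaves out.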

They found the value of this measure to be 7.51 × 10^-4 when all human chromosomes
are averaged. This works out to 99.925% identity (SNP Map Working Group, Nature, 2001).
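The conversion from the heterozygosity value to a percent-identity figure is simple arithmetic, sketched below. The variable names are my own. The last line divides the rounded totals quoted above (920,752 heterozygous positions over 1.2 billion aligned bases) only as a rough cross-check; it is not expected to reproduce the normalized, chromosome-averaged value exactly.

```python
# Normalized heterozygosity reported by the SNP Map Working Group.
heterozygosity = 7.51e-4

# Probability that a position is identical between two random chromosomes.
identity_percent = (1 - heterozygosity) * 100

# The same number expressed as spacing between differences.
bases_per_difference = 1 / heterozygosity

# Rough cross-check from the rounded totals quoted in the text.
raw_rate = 920_752 / 1.2e9

print(f"identity: {identity_percent:.3f}%")
print(f"about one difference every {bases_per_difference:.0f} bases")
print(f"raw rate from rounded totals: {raw_rate:.2e}")
```

Both rates fall a little below one difference per thousand bases, which is where the rounded "99.9%" figure comes from.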
Many earlier studies came up with figures close to this by considering smaller datasets
before SNP DB was available:
- cDNA sequences
- Regions of the X chromosome
- Smaller genomic SNP studies
- More recent reviews that quote the 99.9% figure simply cite earlier reviews,
such as the one written by Collins and Mansoura or, alternatively, a review by Venter.
- Basically, Collins and Venter get ‘credit’ for this observation as the respective leaders of
the two human genome sequencing projects.