SOLiD Sequencing and the Thousand Genomes Project

The Thousand Genomes Project (www.1000genomes.org) is a large-scale international sequencing collaboration that aims to provide the scientific community with a catalog of variants present at a frequency of 1% or greater in the human population across most of the genome, and down to 0.5% or less within genes. This high-resolution map of variation will include single nucleotide polymorphisms (SNPs) as well as structural variants. Once complete, the project will produce the clearest picture yet of the spectrum of normal human genetic variation, and will likely require the sequencing of at least 1000 individual human genomes. At the Human Genome Sequencing Center (Baylor College of Medicine, Houston, TX), we are working to produce over 200 mappable gigabases of sequence for the Thousand Genomes pilot projects over a 5-month period using 6 ABI SOLiD sequencers. We are in the process of producing 140 mappable gigabases of SOLiD data on 24 HapMap samples (as part of the 2x “light sequencing” pilot project), and we have already produced over 60 mappable gigabases of SOLiD data on a single HapMap sample (as part of the 20x “deep sequencing” pilot project). Our deep sequencing data consist of about 14x coverage from 25 bp tag paired-end libraries and about 9x coverage from 35 bp fragments. We will present the details of our SOLiD sequencing program for the Thousand Genomes pilot projects, report on our experience with SNP calling using the deep sequencing data and the vendor-supplied mapping and SNP-calling software, and describe our progress in developing software that identifies structural variants by looking for clusters of paired-end reads that show characteristic rearrangement signals.
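
The paired-end clustering idea mentioned above can be illustrated with a minimal Python sketch. It is not the software described in this abstract. It assumes that a read pair counts as discordant when its mapped span differs from the expected library insert size by more than a tolerance, and that discordant pairs starting close together support the same candidate rearrangement. The names `ReadPair`, `discordant_pairs`, `cluster_pairs`, and the parameters `expected_span`, `tolerance`, `max_gap`, and `min_support` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReadPair:
    """One mapped paired-end read (hypothetical minimal record)."""
    chrom: str
    left_pos: int    # mapped start of the leftmost tag
    right_pos: int   # mapped start of the rightmost tag

    @property
    def span(self) -> int:
        return self.right_pos - self.left_pos


def discordant_pairs(pairs: List[ReadPair], expected_span: int,
                     tolerance: int) -> List[ReadPair]:
    """Keep pairs whose mapped span deviates from the expected insert size."""
    return [p for p in pairs if abs(p.span - expected_span) > tolerance]


def cluster_pairs(pairs: List[ReadPair], max_gap: int,
                  min_support: int) -> List[List[ReadPair]]:
    """Group pairs on the same chromosome whose left positions are chained
    by gaps of at most max_gap; keep groups with at least min_support pairs."""
    clusters: List[List[ReadPair]] = []
    current: List[ReadPair] = []
    for p in sorted(pairs, key=lambda q: (q.chrom, q.left_pos)):
        starts_new = bool(current) and (
            p.chrom != current[-1].chrom
            or p.left_pos - current[-1].left_pos > max_gap
        )
        if starts_new:
            if len(current) >= min_support:
                clusters.append(current)
            current = []
        current.append(p)
    if len(current) >= min_support:
        clusters.append(current)
    return clusters


# Toy data, invented for illustration only.
example_pairs = [
    ReadPair("chr1", 1000, 2500),    # span near the expected size
    ReadPair("chr1", 5000, 11500),   # stretched span, candidate deletion
    ReadPair("chr1", 5100, 11650),
    ReadPair("chr1", 5250, 11700),
    ReadPair("chr1", 90000, 98000),  # discordant but unsupported
    ReadPair("chr2", 5050, 6500),    # span near the expected size
]
candidates = cluster_pairs(
    discordant_pairs(example_pairs, expected_span=1500, tolerance=500),
    max_gap=500,
    min_support=3,
)
```

This sketch clusters on the left tag position only and ignores tag orientation and the consistency of the right-hand positions, both of which a real structural variant caller would need to check before reporting a candidate.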