GCAT-SEEKquence
The Genome Consortium for Active Teaching
NextGen Sequencing Group
NextGen Sequencing Request Form
Complete fields below, save file with your last name at the beginning of
the filename (e.g. newman-GCAT-SEEK Sequence request form.pdf) and
email to Vincent Buonaccorsi <BUONACCORSI@juniata.edu>
A. Contact Information
1. Name: Regina Lamendella
2. Department: Biology
3. Institution: Juniata College
4. Phone Number: 510-486-7384
5. Email Address: RLamendella@lbl.gov
B. Project Information
1. Title: Longitudinal Analysis of fecal microbial communities from an Inflammatory Bowel Disease
Cohort Using Ultra High Throughput sequencing.
2. Category: Metagenomics
3. Total amount of sequence requested: 90 Gigabases
4. Preferred technology: Illumina HiSeq
5. Do you have funds for a partial run next Spring? Yes
C. Describe the background, hypotheses and specific aims (500 words max)
Inflammatory bowel diseases (IBD) are increasing in prevalence in Europe and North America, which likely
is due in part to the Western lifestyle. IBD can be divided into two disease categories: Ulcerative Colitis (UC) and
Crohn’s Disease (CD). Both are chronic, relapsing, immunologically mediated disorders that can have severe
physical consequences. The hypothesis is that a breakdown in host-microbial mutualism is a consequence of a
general breakdown in the balance between protective and harmful bacteria (dysbiosis) in the gut. Thus, there is a
growing search for specific members of the intestinal microbial community that provide the antigens which spark
the inflammation for IBD. Some changes in the microbial community related to IBD have been noted, including
reduced diversity of the Firmicutes, the presence of bacteria that are not normally considered to be commensals,
and increased concentrations of E. coli [1]. While current research has demonstrated a significant difference in the
composition of the gut microbiota of IBD patients as compared to healthy individuals, these studies have been
based on a single time point. Several other factors can also affect IBD, including
disease progression over time (flare-ups vs. remission), diet, surgery and drug use. The goal of this study is to
obtain samples from the same individuals under conditions of active disease and remission, to overcome the
problem of individual variations in gut microbial compositions, which will lead to a clearer understanding of the
interplay between the host inflammatory response and the gut microbiota.
Specific Aim 1: Carry out a longitudinal analysis of the gut microbial communities in fecal samples from patients
with different IBD phenotypes compared to healthy individuals to assess the stability of the gut microbiota.
Specific Aim 2: Carry out a longitudinal analysis of the gut microbial community in fecal samples from IBD
patients to assess changes in the gut microbiota associated with disease severity.
Advancements in sequencing technologies, which offer greater numbers of sequencing reads at much
lower costs, are revolutionizing our understanding of microbial communities. Very recently, paired-end
Solexa/Illumina technology has been used to perform sequencing at the depth of millions of 16S rRNA gene
sequences per sample, enabling exhaustive coverage of diverse environments such as the gut. Further, this
technology is amenable to a high level of multiplexing, which increases its utility for examining hundreds and even
thousands of complex sets of samples [2]. This study plans to leverage the Illumina sequencing platform for 16S
rRNA amplicon analysis (iTags) to deeply survey the microbial community structure of this IBD cohort.
D. Describe the methods [sample prep, calculation of amount of sequence required, analysis plan]
IRB approval has been obtained to prospectively collect stool samples for microbial profiling from 30
patients from American and Swedish IBD cohorts. Metadata collected for these IBD phenotypes include age, sex,
smoking history, serological markers, history of antibiotic use, resectioning, flare-up, and IBD family history.
Samples are being collected from each patient every three months over a 15-month period. Samples are frozen at
-20 °C until further processing. Genomic DNA will be extracted using the MoBio Fecal DNA extraction kit according
to the manufacturer’s instructions, with an additional heat step prior to bead beating (60 °C for 10 min) to aid in
efficient cell lysis. Paired-end, barcoded libraries of hyper-variable (V6 region) 16S rDNA fragments amplified from
samples will be constructed and sequenced using the Illumina HiSeq platform. A total of 90 Gigabases of sequence
data will be necessary: 1,000,000 sequences per sample x 300 bp length (150 bp paired end) x 300 samples (150
samples in duplicate) = 90 Gigabases.
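The sequence requirement above can be checked with a short calculation. This is a minimal sketch rather than part of the original request: the variable and function names are illustrative, and the split of the 150 samples into 30 patients x 5 collections is an assumption inferred from the sampling schedule, not a figure stated in the form.

```python
# Values taken from the request form; names are illustrative only.
READS_PER_SAMPLE = 1_000_000   # target 16S sequences per sample
READ_LENGTH_BP = 150           # length of each read in a pair
READS_PER_PAIR = 2             # paired end: two reads per sequence
PATIENTS = 30
TIME_POINTS = 5                # ASSUMPTION: 5 collections per patient, giving the 150 samples stated
REPLICATES = 2                 # each sample is sequenced in duplicate


def total_gigabases(reads_per_sample, bases_per_sequence, libraries):
    """Return the total sequence requirement in gigabases (1 Gb = 1e9 bases)."""
    return reads_per_sample * bases_per_sequence * libraries / 1e9


samples = PATIENTS * TIME_POINTS                   # 150 samples
libraries = samples * REPLICATES                   # 300 libraries
bases_per_sequence = READ_LENGTH_BP * READS_PER_PAIR  # 300 bp per paired-end sequence

required_gb = total_gigabases(READS_PER_SAMPLE, bases_per_sequence, libraries)
print(f"{samples} samples, {libraries} libraries, {required_gb:.0f} Gb required")
```

Changing any single input (for example, the target depth per sample) shows directly how the total request scales.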
Sequence data will be analyzed using the QIIME pipeline [3]. Briefly, the 16S rRNA gene sequences will be
clustered with uclust and assigned to operational taxonomic units (OTUs) with 97% similarity. Representative
sequences from each OTU will be aligned and assigned with PyNAST [4] using the Greengenes core sequence set.
As the number of sequence reads in each sample may vary, the dataset will be rarefied prior to alpha diversity
calculations. For beta diversity analysis the weighted UniFrac distance matrix will be used for the principal
coordinate analysis (PCoA). Multivariate community analysis and correlation to metadata will be performed within
PC-ORD 5 software using normalized OTU tables (genus-level) generated in QIIME.
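The rarefaction step described above can be illustrated with a small example. The sketch below is not the QIIME implementation; it is a standard-library Python illustration with made-up OTU counts and hypothetical sample names, showing how each sample is subsampled to a common depth before an alpha diversity metric is computed. The Shannon index here uses the natural logarithm, which is an assumption of this sketch.

```python
import math
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample's OTU counts."""
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


def observed_otus(otu_counts):
    """Richness: number of OTUs with at least one read."""
    return sum(1 for n in otu_counts.values() if n > 0)


def shannon(otu_counts):
    """Shannon diversity index using the natural logarithm."""
    total = sum(otu_counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in otu_counts.values() if n > 0)


# Toy OTU table (hypothetical sample names and counts).
otu_table = {
    "patient01_month00": {"OTU_1": 500, "OTU_2": 300, "OTU_3": 200},
    "patient01_month03": {"OTU_1": 90, "OTU_2": 10},
}

# Rarefy every sample to the depth of the shallowest sample.
depth = min(sum(counts.values()) for counts in otu_table.values())
rarefied = {name: rarefy(counts, depth) for name, counts in otu_table.items()}

for name, counts in rarefied.items():
    print(name, sum(counts.values()), observed_otus(counts), round(shannon(counts), 3))
```

Rarefying to the shallowest sample keeps every sample in the analysis; in practice a higher depth is often chosen and samples below it are dropped, which is a trade-off between sample retention and sequencing depth.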
E. Describe the role and number of undergraduates involved in the project, and how they would benefit.
If funded, I plan to invite 10 undergraduates in Spring 2013 into my research program with the goal of
learning bioinformatic analyses for this ultra-high-throughput sequencing data. Using web-based tutorials,
students will analyze the microbial community structure of these samples and employ statistical models to
correlate health-related metadata to sequence data. This research course will provide students with the
foundation for understanding how interdisciplinary approaches are necessary to solve complex health-related
problems. Students will also synthesize primary literature related to informatics issues relevant to
high-throughput sequencing, gut microbiomics, and applications of biotechnology in disease therapy. The results of this
project will serve as the basis for a publication including research students. Additionally, this project will serve as the
basis for one- and two-week modules in General Biology and Microbiology courses, exposing potentially hundreds
of students to how high-throughput sequencing technologies are revolutionizing modern science.
F. I agree to administer the GCAT-SEEK pre- and post-activity assessment test for students and to complete the
faculty post-utilization survey. _X_ yes, ____ no
G. Describe any other broader impact or intellectual merit considerations.
Analysis of microbial community structure in this longitudinal inflammatory bowel disease cohort could
reveal bacteria or groups of bacteria associated with flare-up and remission states. These findings could
lead to the development of novel biomarkers or even future therapeutics for the disease.
H. References
1. Willing BP, Dicksved J, Halfvarson J, Andersson AF, Lucio M, Zheng Z, Jarnerot G, Tysk C, Jansson JK,
Engstrand L: A Pyrosequencing Study in Twins Shows That Gastrointestinal Microbial Profiles Vary With
Inflammatory Bowel Disease Phenotypes. Gastroenterology (2010) 139(6):1844-U1105.
2. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, Fierer N, Knight R: Global
patterns of 16S rRNA diversity at a depth of millions of sequences per sample. P Natl Acad Sci USA
(2011) 108:4516-4522.
3. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich
JK, Gordon JI, Huttley GA et al: QIIME allows analysis of high-throughput community sequencing data.
Nat Methods (2010) 7(5):335-336.
4. Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R: PyNAST: a flexible tool for
aligning sequences to a template alignment. Bioinformatics (2010) 26(2):266-267.