SUPPLEMENTARY MATERIALS

Supplementary results: Burkholderia and Bradyrhizobium were represented by Burkholderia fungorum and Bradyrhizobium pachyrhizi, respectively.

In the original analysis using QIIME, 6,689 sequence reads were classified to the genus Burkholderia. For species identification, we analyzed their relationship to the 16S rRNA genes of 59 Burkholderia species on a phylogenetic tree and found that all except 4 of the reads clustered with Burkholderia fungorum (Supplementary Figure 1A). To confirm their relationship to B. fungorum, we amplified full-length 16S rRNA genes from a duodenal sample from one of the HIV-infected patients using universal primers 8F and 1510R. The PCR products were ligated into the pGEM-T Easy vector (Promega) and cloned in E. coli DH5α competent cells as previously described[1]. One hundred and three clones were initially sequenced using primer 8F and classified using SEQMATCH in RDP II[2]. Clones classified as either Burkholderia or Bradyrhizobium were sequenced bidirectionally with the vector-based primers T7 and SP6. Sixteen of the 103 clones belonged to Burkholderia; of these, 14 were identical to, and 2 had only one mismatch with, a 16S rRNA gene of the B. fungorum type strain LMG 16225 (GenBank accession number AF215705).

QIIME assigned 432 reads in the sequence dataset to Bradyrhizobium. Initial phylogenetic analysis with the V3/4 region (positions 366-784) of the type strains of 14 Bradyrhizobium species indicated that nearly all reads (n=411) formed a single cluster with 4 species, i.e. B. elkanii, B. jicamae, B. lablabi, and B. pachyrhizi (Supplementary Figure 1B). A similarity search using BLAST showed that the reads could not be classified further because the four species are identical in this region.
To distinguish these species, we examined the 103 clones described above. Three of them belonged to Bradyrhizobium, and their full-length 16S rRNA gene sequences closely matched a 16S rRNA gene of the Bradyrhizobium pachyrhizi type strain PAC48 (GenBank accession number AY624135), with only one, two, and four mismatches over the 1501-nucleotide sequence, respectively (Supplementary Table 2). We classified this taxon as Bradyrhizobium pachyrhizi because this degree of mismatch (0.07-0.13%) is comparable with the average diversity (0.55%) among 16S rRNA gene copies within individual prokaryotic genomes[3] and far below both the popular threshold of 3% and the stricter threshold of 1% used to define species[4].

Supplementary Figure 1. Phylogenetic relationship between 16S rRNA gene sequence reads of Burkholderia and Bradyrhizobium and 16S rRNA genes of type strains. Phylogeny was estimated by neighbor-joining of pairwise nucleic acid distances based on the 419-bp region of the 16S rRNA genes corresponding to positions 366 to 784 of the E. coli 16S rRNA gene. Numbers represent percentage bootstrap support (1,000 replicates). The scale bar indicates nucleic acid substitutions per site. For Burkholderia, a phylogenetic tree was constructed using 16S rRNA genes of type strains representing 57 Burkholderia species and 6,785 sequence reads originally classified to Burkholderia by QIIME (A). For Bradyrhizobium, a phylogenetic tree was constructed with type strains of 14 Bradyrhizobium species and 432 sequence reads originally classified to Bradyrhizobium by QIIME (B).

Supplementary Figure 2. Difference between HIV-infected patients and controls in the duodenal microbiome by Gram-stain property and major phyla. Taxa were grouped by the tinctorial properties of the bacteria as Gram-positive, Gram-negative, or unclassified (not shown) and compared between HIV-infected patients and controls (A).
Phylum-level comparisons between HIV-infected patients and controls focused on the five major phyla that account for more than 95% (B) of the duodenal bacterial population (C). P values were calculated using the Mann–Whitney U test.

Supplementary Figure 3. Correlation between the relative abundance values of B. fungorum and B. pachyrhizi and blood CD4+ T cell counts. Taxa informative of HIV infection were further analyzed at each of the three sites of the proximal gut, namely the duodenum (A), stomach (B), and esophagus (C), with the mouth (D) as a reference. The x axis denotes CD4+ T cell counts (cells/mm3). The dashed line marks the threshold of normal CD4+ T cell counts (500/mm3). Correlations were assessed with Spearman's rank-order correlation.

Supplementary Table 1. Subject characteristics

Category                                   HIV-1-infected subjects (range)   Uninfected subjects
Number of subjects                         8                                 8
Age (years)                                36.5 (24-50)                      47.6 (25-60)
Gender: male (%)                           5/8 (62.5%)                       4/8 (50%)
CD4 count (cells/mm3) a                    327.5 (12-708)                    n/a
CD8 count (cells/mm3) b                    835.4 (267-1462)                  n/a
CD4:CD8 ratio c                            0.4 (0.1-0.8)                     n/a
Plasma viral load (HIV-1 RNA copies/ml)    23,729 (9,090-49,700)             n/a
Upper GI symptoms (nausea, vomiting)       2/8                               0/8
Lower GI symptoms (watery diarrhea)        3/8                               0/8
Years since first HIV-seropositive test    2.1 (0.16-7)                      n/a
HAART d                                    0/8                               n/a
HIV positive test                          8/8                               0/8 e
Antibiotics treatment f                    1/8                               0/8
Race: White                                3 (37.5%)                         1 (12.5%)
Race: African American                     5 (62.5%)                         6 (75.0%)
Race: Asian                                0                                 1 (12.5%)

a. Normal range for CD4+ T cells: 500-1000 cells/mm3.
b. Normal range for CD8+ T cells: 150-1000 cells/mm3.
c. Normal range for CD4:CD8 ratio: 0.9 to 3.7 in adults.
d. No highly active antiretroviral therapy for ≥6 months prior to specimen collection.
e. OraQuick ADVANCE Rapid HIV-1/2 antibody test.
f. Of the 8 HIV+ subjects, seven had no antibiotic treatment for at least 8 weeks before specimen collection, while one was on antibiotics for syphilis.

Supplementary Table 2.
Number of positions that differ between the cloned near-full-length 16S rRNA genes and those of closely related Bradyrhizobium species over a 1,356-bp overlapping region.

Species          Clone 3   Clone 42   Clone 79   Dissimilarity (%)
B. pachyrhizi    2         1          4          0.07-0.29
B. elkanii       5         4          7          0.37-0.52
B. jicamae       12        11         6          0.44-0.88
B. lablabi       10        9          6          0.44-0.74

Supplementary Table 3. Correlation between CD4/CD8 ratio and bacterial taxa in HIV-infected patients.

CD4/CD8 ratio Taxa Mouth Esophagus Stomach Duodenum p value p value p value p value r2 r2 r2 r2 <0.0001 0.693 0.0003 B. fungorum absent absent 0.091 0.066 0.722 B. pachyrhizi 0.846 <0.000 0.091 0.222 0.146 0.244 0.0003 0.721 Ralstonia absent absent 0.091 0.148 0.139 0.045 0.0007 0.729 Fusobacterium 0.493 0.041 0.665 0.003 0.823 0.010 0.058 0.479 Lactobacillus 0.078 0.041 <0.0001 0.322 0.352 0.322 absent absent Prevotella 0.693 0.041 0.243 0.384 0.779 0.064 0.102 0.376

Supplementary references

1. Pei Z, Bini EJ, Yang L, Zhou M, Francois F, Blaser MJ. Bacterial biota in the human distal esophagus. Proc Natl Acad Sci U S A 2004,101:4250-4255.
2. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol 2007,73:5261-5267.
3. Pei AY, Oberdorf WE, Nossa CW, Agarwal A, Chokshi P, Gerz EA, et al. Diversity of 16S rRNA genes within individual prokaryotic genomes. Appl Environ Microbiol 2010,76:3886-3897.
4. Stackebrandt. Taxonomic parameters revisited: tarnished gold standards. Microbiol Today 2006,2006:153-155.
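
The species assignment for Bradyrhizobium rests on simple arithmetic: mismatch counts divided by the length of the compared region, set against the 1% and 3% species thresholds[4] and the 0.55% average intragenomic diversity[3]. The following is a minimal sketch of that calculation, not the analysis code used in the study. It assumes the B. pachyrhizi mismatch counts and the 1,356-bp overlap reported in Supplementary Table 2; the function and variable names are hypothetical.

```python
# Minimal sketch (assumption: counts and region length taken from
# Supplementary Table 2; all names here are illustrative only).

ALIGNED_LENGTH_BP = 1356  # overlapping region compared in Supplementary Table 2

# Mismatches between each clone and the B. pachyrhizi type strain PAC48
MISMATCHES_VS_B_PACHYRHIZI = {"Clone 3": 2, "Clone 42": 1, "Clone 79": 4}

# Reference values cited in the supplementary results (percent)
SPECIES_THRESHOLDS_PCT = {"popular (3%)": 3.0, "strict (1%)": 1.0}
INTRAGENOMIC_MEAN_PCT = 0.55


def percent_dissimilarity(mismatches: int, aligned_length: int) -> float:
    """Return mismatches as a percentage of the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * mismatches / aligned_length


dissimilarity = {
    clone: percent_dissimilarity(n, ALIGNED_LENGTH_BP)
    for clone, n in MISMATCHES_VS_B_PACHYRHIZI.items()
}

for clone, pct in dissimilarity.items():
    # A clone is consistent with the species only if it is under every threshold
    below_all = all(pct < limit for limit in SPECIES_THRESHOLDS_PCT.values())
    print(
        f"{clone}: {pct:.2f}% dissimilarity, "
        f"below both species thresholds: {below_all}, "
        f"below intragenomic mean ({INTRAGENOMIC_MEAN_PCT}%): "
        f"{pct < INTRAGENOMIC_MEAN_PCT}"
    )
```

Under these assumptions, the three clones give roughly 0.15%, 0.07%, and 0.29%, which reproduces the 0.07-0.29% range listed for B. pachyrhizi in Supplementary Table 2.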