Background
The human microbiome is composed of trillions of organisms occupying the gastrointestinal, genitourinary, and respiratory tracts, the skin, and the oral cavity.
It is influential in immunity, vitamin production, digestion, and maintenance of the intestinal mucosa.
Shifts in microbiota composition may have significant consequences for the homeostasis of the host.

Aims
Implement supervised latent Dirichlet allocation (sLDA) to describe microbial architecture.
Predict host features based on microbiome composition as a function of topics.

Data
Open-access American Gut Project.
Operational taxonomic unit (OTU) genomic information from 3832 subjects, with information on 194 variables such as sex, diet, and weight.

Methods
1 sLDA
2 Support Vector Machine
3 Random Forest

[Figure panels: Prediction; Feature Selection; Sex | Run 7 (F=0 | M=1); Body Site | Run 1 (Skin=0 | Gut=1); Obesity | Run 1 (Lean=0 | Obese=1); Coefficients*]

* Coefficients are not directly associated with the two regression figures (Log(OTU) vs. raw BMI). These are the logistic regression coefficients from the sLDA model fit. sLDA fits a maximized set of topic assignments, then regresses the document annotations (in this case 0 or 1 for non-obese or obese, respectively) against the distribution of words assigned to a given topic $k$ for each document $d$: $\eta^{T} Z = Y_d$, where $\eta$ is the vector of regression coefficients, $Y_d$ is a vector of annotations of length $D$, and $Z$ is a $D \times K$ matrix, where $K$ is the number of topics and $Z_{i,j} = \sum_{n=1}^{N} \mathbb{1}(z_{n,d} = k_j,\ d = d_i)$, that is, the number of words in document $d_i$ assigned to topic $k_j$.

Accuracy
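As a rough illustration of the regression step described in the Methods footnote, the sketch below builds the document-by-topic count matrix Z from per-word topic assignments and regresses binary annotations on it. Everything here is an assumption made for illustration: the function names `topic_count_matrix` and `fit_logistic` are hypothetical, the toy assignments and labels are invented, and plain gradient descent stands in for whatever fitting procedure the sLDA implementation actually uses. A real sLDA fit estimates topics and coefficients jointly, which this sketch does not do.

```python
import numpy as np

def topic_count_matrix(assignments, num_topics):
    """Return Z (D x K): Z[i, j] = number of words in document i assigned to topic j."""
    Z = np.zeros((len(assignments), num_topics))
    for i, z_d in enumerate(assignments):
        Z[i] = np.bincount(np.asarray(z_d), minlength=num_topics)
    return Z

def fit_logistic(Z, y, lr=0.1, steps=2000):
    """Logistic regression of y on Z by gradient descent (no intercept, no penalty)."""
    eta = np.zeros(Z.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Z @ eta))     # predicted probability of label 1
        eta -= lr * Z.T @ (p - y) / len(y)     # gradient of the mean log-loss
    return eta

# Invented toy data: 4 documents (subjects), 3 topics, one topic index per word (OTU read).
assignments = [[0, 0, 1], [0, 0, 0, 2], [1, 2, 2], [2, 2, 1, 2]]
labels = np.array([0, 0, 1, 1])                # e.g. 0 = non-obese, 1 = obese

Z = topic_count_matrix(assignments, num_topics=3)
eta = fit_logistic(Z, labels)
predicted = (1.0 / (1.0 + np.exp(-Z @ eta)) > 0.5).astype(int)
```

In this toy setup each entry of `eta` is the coefficient of one topic, so its sign indicates whether more words assigned to that topic push the prediction toward label 1 or label 0.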