The biogeography of abundant and rare bacterioplankton in the lakes and reservoirs of China

Lemian Liu1, Jun Yang1*, Zheng Yu1 and David M. Wilkinson2

1 Aquatic EcoHealth Group, Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, People’s Republic of China
2 School of Natural Science and Psychology, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK

Supplementary Information

Summary

The supplementary information includes six supplementary figures and two supplementary tables.

Figure S1 Rarefaction curves of similarity-based operational taxonomic units (OTUs) at a cluster distance of 0.03. Left: the individual samples; right: the combined set of 42 samples.

Figure S2 The number of OTUs and sequences of always abundant (OTUs with > 1% abundance in all samples), normally abundant (OTUs with > 1% abundance in > 70% of the samples), oscillating (OTUs that do not fall into any of the other categories), normally rare (OTUs with < 0.01% abundance in > 70% of the samples), and always rare bacteria (OTUs with < 0.01% abundance in all samples).

Figure S3 MultiCoLA profiles based on the dataset-cutoff approaches. The rare OTUs were removed (A-C) or retained (D-F) in each truncated dataset. Abundance of OTUs in each truncated dataset (A, D). Non-parametric Spearman correlations comparing the deviation in complete data structure between the original matrix and the truncated matrices (B, E). Comparison of the most important axes of extracted variation between the original and truncated datasets (C, F). Missing points are due to sample loss when a given cutoff is applied to the original dataset. The red line indicates the threshold for abundant OTUs (32.0%) and the blue line indicates the threshold for rare OTUs (2.7%) used in this study.

Figure S4 Spearman’s rank correlation between the environmental Euclidean distance and geographical distance (n is the number of comparisons; all ten environmental variables were used, see Materials and Methods).

Figure S5 Frequency distribution of abundant and rare bacterial OTUs from 42 inland lakes and reservoirs of China.

Figure S6 Community composition of abundant bacterial taxa compared with rare bacterial taxa in 42 lakes and reservoirs of China. Values and error bars indicate mean and standard error, respectively. ns: not significant; *: P < 0.05 (t test).

Table S1 Sample information from 42 lakes and reservoirs of China

Lake name             Lat. (°N)   Long. (°E)   Depth (m)   Region   No. of OTUs
Bantou R.             24.67       118.02        8.0        FJ       1223
Dongzhen R.           25.48       118.94       21.4        FJ        868
Hubian R.             24.50       118.15        6.8        FJ       1099
Shidou R.             24.69       118.01       13.6        FJ       1074
Tingxi R.             24.80       118.14       23.6        FJ       1043
Baiyangdian L.        38.94       115.98        1.3        ECC      1715
Dongping L.           35.97       116.19        3.4        ECC      1350
Hengshui L.           37.62       115.63        1.3        ECC      1392
Hongze L.             33.28       118.73        2.3        ECC      1143
Luoma L.              34.05       118.22        3.5        ECC      1126
Weishan L.            34.64       117.28        2.2        ECC      1516
Daihai L.             40.57       112.67        8.6        IM       1247
Donghaizi L.          40.63       107.00        1.5        IM        816
Hasuhai L.            40.61       110.97        1.5        IM       1165
Quansanhaizi L.       41.07       107.87        1.5        IM        972
Shahu L.              38.83       106.36        2.3        IM       1121
Shenglihaizi L.       41.12       107.83        4.0        IM       1099
Wuliangsuhai L.       40.87       108.79        2.6        IM       1232
Xinghai L.            38.99       106.40        1.5        IM       1162
Yuehai L.             38.56       106.20        0.9        IM       1257
Bei’er L.             47.93       117.70        5.4        IM       1167
Huhenuo’er L.         49.30       119.23        1.8        IM       1479
Hulun L.              49.12       117.54        3.2        IM       1529
Wulanpao L.           48.36       117.52        1.6        IM       1564
Amutapao L.           46.61       124.06        2.1        NEC      1251
Dongxintunnanpao L.   46.81       124.26        2.0        NEC      2026
Kulipao L.            45.37       124.50        2.2        NEC      1751
Lamasipao L.          46.29       124.10        2.1        NEC      1894
Qijiapao L.           46.82       124.28        1.8        NEC      1913
Tianhu L.             46.87       124.40        1.3        NEC      1587
Xinhuangpao L.        45.63       123.76        1.6        NEC      1299
Xinmiaopao L.         45.21       124.45        1.8        NEC      1989
Yueliangpao L.        45.74       124.00        4.1        NEC      1830
Chaohu L.             31.52       117.56        3.8        CJ       1755
Gucheng L.            31.28       118.92        4.2        CJ       1283
Longgan L.            29.94       116.17        3.0        CJ       1623
Liangzi L.            30.24       114.51        3.0        CJ       1616
Nanyi L.              31.12       118.98        4.4        CJ       1550
Shijiu L.             31.47       118.89        5.3        CJ       1692
Shengjin L.           30.39       117.04        5.0        CJ       1632
Taibai L.             29.96       115.80        2.3        CJ       1416
Taihu L.              31.22       120.14        3.1        CJ       1280

Table S2 General description of all, abundant and rare OTU data sets at the 97% similarity level

                 OTU number      Sequence number    Chao 1        ACE
All OTUs         10559           1105524            10791 ± 22    10952 ± 48
Abundant OTUs    143 (1.4%)      751588 (68.0%)
Rare OTUs        7598 (72.0%)    29824 (2.7%)

Abundant OTUs were defined as OTUs with an abundance > 1% in a sample and a mean relative abundance of > 0.1% across all samples. Rare OTUs were defined as OTUs with an abundance < 0.01% in a sample and a mean relative abundance of < 0.001% across all samples.
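
The abundant and rare definitions given under Table S2 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_otus`, the input layout (one list of read counts per sample, with OTUs in the same order), and the label "other" are all hypothetical. It also assumes that "in a sample" means "in at least one sample", which the definition above does not state explicitly.

```python
def classify_otus(counts):
    """Label each OTU as 'abundant', 'rare' or 'other'.

    counts: list of samples; each sample is a list of read counts,
    one per OTU, in the same OTU order (hypothetical input layout).
    """
    n_samples = len(counts)
    n_otus = len(counts[0])

    # Relative abundance (%) of every OTU within each sample.
    rel = []
    for sample in counts:
        total = sum(sample)
        rel.append([100.0 * c / total for c in sample])

    labels = []
    for j in range(n_otus):
        col = [rel[i][j] for i in range(n_samples)]
        mean = sum(col) / n_samples
        # Assumption: "in a sample" is read as "in at least one sample".
        if max(col) > 1.0 and mean > 0.1:
            labels.append("abundant")
        elif min(col) < 0.01 and mean < 0.001:
            labels.append("rare")
        else:
            labels.append("other")
    return labels


if __name__ == "__main__":
    # Two toy samples of 1,000,000 reads each, four OTUs (made-up numbers).
    toy = [
        [900000, 99499, 500, 1],
        [500000, 499500, 500, 0],
    ]
    print(classify_otus(toy))
```

Under this reading, an OTU that matches neither rule is left as "other"; the finer categories of Figure S2 (normally abundant, oscillating, normally rare) would need additional per-sample frequency checks that this sketch does not attempt.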