Dear Dr. Sussman,

It is with great excitement that I present the resubmission of our manuscript titled “Comparative Genomic Analysis of the Human Fungal Pathogens Coccidioides and their Relatives.” We were pleased with the reviewers' comments and appreciate their insight and guidance in improving the manuscript. All of their suggestions were accommodated. The fundamental changes include clarification of the claims and their support, statistical analysis of gene family size evolution, and clarification of spherule-specific gene identification. In addition, we've improved the presentation of the manuscript by targeting the communication to a general genomic readership.

I am pleased to say that, along with addressing all reviewer comments, we successfully reduced the total character count of the manuscript from 86,254 to 59,551 (31%) and removed three figures. Although the fundamental concepts presented in the original manuscript are not altered, the revision is so complete that we are not able to provide a line-by-line analysis of where these reductions in length occurred. In large part, the reviewer comments have substantially improved the quality of the manuscript.

Included with this letter are point-by-point considerations of the reviewer comments (which follow), our manuscript, and all tables, figures and supplemental information. If there are questions or comments, we are always accessible at the contact information below.

Many thanks,

Thomas J. Sharpton
321 Koshland Hall
Berkeley, CA 94720
Phone: 510-642-8441
Fax: 510-642-4995
sharpton@berkeley.edu

for: Jason E. Stajich, Bridget Barker, Garry T. Cole, Malcolm Gardner, Marcin Grynberg, Chiung-Yu Hung, Vinita Jordar, Theo N. Kirkland, Cody McMahan, Rama Maiti, Anna Muszewska, Dan Neafsey, Marc Orbach, Steve Rounsley, Jennifer Wortman, and John W. Taylor

The following is point-by-point commentary regarding review comments. Original reviewer comments are italicized while our replies are unitalicized.
REVIEWER 1 COMMENTS

Sharpton et al present a comparative genomic study of two species pathogenic fungi (Coccidioides sp.) along with related species of non-pathogenic fungi. Through sequences analysis they identify families of genes that have expanded/contracted in the lineages that lead to the pathogenic species. They also identify genes that show increased rates of substitution in the pathogenic lineages and are candidates for positively selected genes. The main claims of the paper are that the evidence suggests “that Coccidioides species are not soil saprophytes, but that they remain associated with their dead animal hosts in soil,” and that the study has identified "genes and proteins associated with the evolution of pathogenicity in Coccidioides." In my opinion neither claim is well supported. The evidence is suggestive, but certainly not conclusive. In any comparative genomic study of any set of species one will always find gene family expansions and contractions and evidence for fast evolving genes. What is the evidence that these observations have anything at all to do with pathogenicity?

Our questions concern the adaptation of Coccidioides to growth with animals in nature and to pathogenicity in humans. Here, we have compared genomes to find evolutionary evidence to support hypotheses identifying genes involved in these two adaptations. We feel that the gene family expansions and contractions associated with the shift of Coccidioides (and Onygenales in general) from plants to animals provide very strong evidence that proteases are involved in the shift, and we have made this point clearer in the manuscript. Regarding pathogenicity, the genes that we have identified by rate of evolution, selection and level of expression include those identified by others, using different approaches, as important to pathogenicity in fungi. The reviewer’s complaint would apply to any comparative genome study.
To fully answer the reviewer, we and others will need to manipulate the genomes of the fungi involved. We are working on this research, but we believe that the field will advance more rapidly if we publish our findings now to attract many labs to the task of testing the hypotheses.

The gene expansions/contractions are the same in all the Onygenales lines and are not specific to pathogenic Coccidioides. The leap from the observation of expanded gene families to the ecology of these fungi is a long one.

The reviewer’s comments were very valuable in pointing out the lack of clarity in our original manuscript. There are two parts to our manuscript: one on the shift from plants to animals by Onygenales, and a second on pathogenicity in Coccidioides compared to other Onygenales. We now make it clear that the gene family expansions and contractions shared between Coccidioides and U. reesii support the shift from plants to animals, but they do not support the gain in pathogenicity in Coccidioides. However, our data on rate of evolution, selection and transcription support our hypotheses about the gain of pathogenicity in Coccidioides.

If one were to pick any two species from this analysis and compare them to the other species it is likely that one would again find gene families that have expanded/contracted and a small set of genes that show increased rates of substitution. This is what we expect when making a large number of comparisons across genomes.

The reviewer is incorrect regarding this point. There are only two significant Coccidioides gene family expansions in the entire 12-genome data set. The most notable is a family that codes for enzymes that attack animal protein; its expansion correlates evolutionarily with the shift from plant-based nutrition to animal-based nutrition.
If we picked any two fungi from our pool, we would not necessarily see any significant, evolutionarily related gene family expansions, and picking any two at random is certainly unlikely to identify the only two families significantly expanded in Coccidioides.

To me this paper is really more of a "data dump" than an in depth analysis of the evolution of pathogenicity in Coccidioides. No doubt the sequence will serve as a valuable resource to those working in this field.

Again, we have two points in the manuscript, which now are clearly identified: one about the adaptation of fungi to animal substrates from a history of plant substrates, and a second about pathogenicity. We have focused the data presented to address these two points. Rest assured that it is not a “data dump,” particularly after the roughly one-third reduction in length.

But the relevant data could be presented in a much more condensed form. I think the paper is three times as long as it needs to be.

As mentioned, we have shortened the manuscript substantially and underscored the claims in question.

It would be fine as a short communication describing the resource and presenting two small tables of expanded/contracted gene families and a list of potential genes under positive selection. Everything else, including speculation about the ecology of these fungi, is superfluous. In the end, and this is just my opinion, this manuscript will only be of interest to those working on these particular pathogenic fungi.

Ours is the first comparative genomic study using such a large group of fungal genomes with the themes of adaptation to a shift in nutrition and to pathogenicity. It is the first publication of four new, complete, evolutionarily related eukaryotic genomes. We feel that it is much more than a short communication and that it will inspire much research, both in comparative genomics of other groups of fungi and in testing the hypotheses raised about Coccidioides.
Minor technical point: The entire first paragraph of the results is dedicated to describing the different number of genes in the two closely related Coccidioides species. But then we are told that this difference is likely due to the different annotation methods used on the two different genomes. It seems very sloppy to use two different annotation methods on two closely related genomes that are to be compared. It also, in my opinion, diminishes the usefulness of the resource to other groups.

Again, we thank the reviewer for pointing out places where our manuscript was not clear. As the manuscript now states, because of the differences in annotation, we considered only orthologs shared between the two species (for the very reason the reviewer has mentioned). In addition to this conservative approach, we checked our key finding of gene family expansions and contractions in each genome independently, finding no significant differences. Of the 2,798 total gene families considered, only 36 have a size difference greater than 1 between C. immitis and C. posadasii, and only 14 have a size difference greater than 2 between these taxa. None of these differences is large enough to affect the statistical analysis of gene family size evolution in Coccidioides.

REVIEWER 2 COMMENTS

The authors report their analysis of two pathogenic Coccidioides stains and a closely related non-pathogenic relative Uncinocarpus reesii, and a more distantly related pathogenic fungus Histoplasma capsulatum. These four Onygenales genomes are also compared to 13 Pezizomycete fungi. The results of their analyses help to shed light of the evolutionary modifications necessary for pathogenesis in the Coccidioides. They also suggest that, rather than being soil saprophytes, the pathogenic Coccidioides remain associated with their dead animal hosts while in the soil- a hypothesis consistent with the authors analyses.
Overall is this a fine paper, likely to attract interest from a broad spectrum of readers interested in fungal evolution and comparative genomics in general. It also makes for a nice reference for readers interested in state-of-the-art comparative genomics techniques. The paper is quite well written, and generally accessible to an audience with limited mycological training; but a bit more background on the phylogenetic relationships of the various organisms would increase readership.

As previously mentioned, we have incorporated more information about the relationships and differences between the taxa referenced in this study throughout the manuscript.

Major points. The authors suggest that the differences in gene numbers in the two Coccidioides species are likely the result of different annotation protocols. They report that C. immitis has 10,355 and that C. posadasii has 7,229 genes. Interesting 9,996 (97%) C. immitis genes have clear partners in posadasii. Isn’t this finding also consistent with the C. immitis genes being split into pieces relative to the posadasii ones? Not across contigs, but rather split due to some tendency in the annotation procedure to split a single gene into two or more genes in C. immitis? This seems a better explanation of the facts than the explanation offered by the authors that only posadii genes with EST or protein support are included in its gene set. I’d like to see this point addressed in more detail in the final version of the paper.

This is an excellent point and we have incorporated it accordingly in the results with the following passage: “While similar in size, these genomes differ in the number of annotated genes with 10,355 in C. immitis and 7,229 in C. posadasii (Table 1). This variation most likely results from the use of different annotation methodologies by the sequencing institutions (see Methods). In particular, gene splitting and fusion occurred during annotation, as evidenced by the 9,996 C. immitis genes that have BLASTN hits with greater than 90% identity in the C. posadasii genome. The comparative analyses presented here employed a conservative approach, considering only those genes annotated in both species.”

Regarding Spherule specific genes the authors state, “We compared expressed sequence tag (EST) libraries generated from Coccidiodes growing as mycelia or as spherules.” How was this done? I can’t find it mentioned in the materials & methods. How significant (statistically) are the differences between the transcripts in the two libraries?

We have added a section to the methods regarding the identification of spherule-specific genes. Specifically, ESTs from Coccidioides grown in the hyphal and spherule (pathogenic) stages were compared. We employed an absolute criterion: putatively spherule-specific genes were those with ESTs present only in the spherule stage. The following is the text we've added: “The 53,664 C. posadasii ESTs (26,201 from mycelium and 27,463 derived from in vitro spherules) and the 65,754 C. immitis ESTs (22,382 from mycelia and 43,372 from in vitro spherules) were aligned to annotated genes via BLAT (Kent 2002). Evaluating the morphology from which the aligned ESTs were derived enabled classification of gene expression condition. Putative spherule specific genes were thus identified as those that matched at least one spherule derived EST and no mycelial ESTs.”

Minor points. The Authors point out that the two Coccidioides genomes differ in repeat content. From the description on the Results section it appears that this is all simple sequence repeat, not transposons, but it is hard to be certain from the text; clarifying this minor point would strengthen the manuscript.

To be clear, this discussion is not referring to simple sequence repeats, but to large, interspersed (putatively transposable) sequences.
We have clarified this position in the results: “Although the non-repetitive sequence of these genomes differs only by 400 kb, there is a large difference in repetitive DNA (C. immitis 17%, C. posadasii 12%) that accounts for an additional 1.84 Mb of long, interspersed repetitive sequence in C. immitis (Table 2).”

With respect to gene gain and loss: how sensitive are these results to mis-annotation? Especially given the caveats I mentioned above regarding split genes? Some mention of this issue in the Results/Discussion would improve the manuscript.

We understand the confusion on this point and have reinforced the following argument in the manuscript. Since this analysis only considers ortholog clusters where both Coccidioides species are represented, it is conservative and, if anything, underestimates the true number of gains and losses (missed gene in one of the two species). The minimal difference in the number of losses between the two Coccidioides species arises because a very small number of ortholog clusters contain more than one gene copy, owing either to a true paralogy event since speciation or to gene splitting/fusion. However, as noted, this difference is negligible and no conclusions are drawn from any of these genes. From our manuscript: “Utilizing H. capsulatum as an outgroup, a parsimony-based analysis of 4,460 orthologs reveals that the U. reesii lineage experienced a greater rate of gene turnover (1,075 gains and 399 losses) than the Coccidioides lineages (280-291 gains and 109 losses). However, to account for annotation differences, Coccidioides gene gains and losses were only considered if both species had at least one copy of the ortholog. Thus, these findings may represent an underestimate of the total gene turnover rate in either species.”

The time line of the evolution in pathogenic phenotype in figure 8 is a bit doubtful, since I don’t see why acquiring virulence should come after the divergence of Coccidiodes from U.reesii.
The author did not rule out the possibility that the pathogenic phenotype emerged even before the divergence of U.capsulatum, and U.reesii lost it while the other two taxa retained. A discussion of this issue would strengthen the manuscript.

Though the reviewer’s hypothesis is a possibility, parsimony argues against it, as all known, closely related taxa between Coccidioides and U. reesii are non-pathogenic. Either many lineages independently lost the virulent phenotype, or Coccidioides and H. capsulatum independently acquired it. Certainly it is not known, but given the data in hand, ours is the strongest hypothesis.

The argument that gene family size changes in Coccidioides and U.reesii preceded the divergence of these two taxa, is not entirely convincing. If the two taxa diverged before the gene family expansion or contraction, they might still have similar gene-family sizes due to convergent evolution, especially because they have similar niche. To really persuade me, the authors should include more quantitative data to show the level of similarity in gene family size between Coccidioides and U.reesii versus some outgroups.

The reviewer is wrong on this point. His or her hypothesis would be acceptable except that the phylogenies of the gene family expansion show a cluster of the two Coccidioides species and Uncinocarpus at each duplication. If the duplications were independent, there would be a cluster of Uncinocarpus duplicates independent from a cluster of Coccidioides duplicates. Our data clearly show the former and not the latter.

The authors report a combination of window and gene-based analyses to identify 53 species-specific gene-containing regions. A little more detail about how these regions are defined, as well as just what is meant by the term ‘species-specific region’ would be helpful. If the definition of ‘species-specific’ is based on sequence-homology, the criterion to distinguish specific vs. non-specific should be noted.
We now clarify the detection and definition of these regions in the results. We indeed used a homology-based method, in which 1 kb windows of one genome were compared to the other Coccidioides genome. If the top hit contained fewer than 500 bp, then the window was considered unique to the species in question. As per the manuscript: “Species specific differences in chromosome structure were identified where less than 500bp of homologous sequence was found when comparing 1kb windows across the C. immitis and C. posadasii genomes.”

A bit more information about coccidioidomycosis, along the lines of the Wikipedia page would strengthen the manuscript.

We, once again, heed the advice of the reviewer and have incorporated additional information about coccidioidomycosis, albeit brief given the space constraints, into the introduction: “As the causal agent of coccidioidomycosis or ‘Valley Fever’, Coccidioides infects at least 150,000 people annually, approximately 40% of whom develop a pulmonary infection (Hector et al. 2005). However, a chronic and disseminated form of coccidioidomycosis, for which existing treatments can be prolonged and difficult to tolerate, occurs in roughly 5% of patients (Galgiani et al. 2005). The virulent nature of this fungus and its potential for dispersal by airborne spores led to its listing as a U.S. Health and Human Services Select Agent (Dixon 2001) and has fueled efforts to develop an effective vaccine and new treatments (Hector et al. 2007; Hung et al. 2002).”

The discussion of RIP in the Discussion section seems to come from nowhere. Mentioning these analyses in the Results would avoid this non sequitur.
We corrected this communication gaffe by placing the discussion of RIP in the context of the identification of a CpG dinucleotide bias in Coccidioides long, interspersed repetitive elements: “Repetitive genomic sequences in Coccidioides exhibit a biased nucleotide and dinucleotide composition, a phenomenon observed in other filamentous fungi that are subject to RIP (repeat induced point-mutation).”

REVIEWER 3 COMMENTS

The aim of this study was to identify genomic changes that could be associated with the evolution of the life history and virulence of the human pathogenic fungi Coccoidioides immitis and C. posadasii. The genomes of C. immitis and C. posadasii, together with the closely related non-pathogenic fungus Uncinocarpus reesii and the more distantly related human pathogenic fungus Histoplasma capsulatum were sequenced. These four species belong to the Onygenales clade. Through a hierarchical procedure starting with a broad comparison of the four Onygenales genomes with those of other more distantly related ascomycetes, followed by comparison at an intermediate evolutionary distance among the four Onygenales genomes and finally with a comparison between the genomes of C. immitis and C. posadasii, changes in gene family sizes, gains and losses of genes, rapidly evolving genes and genes under positive selections were identified.

This is a very interesting and well done study. The approach using hierarchial comparative genomics, contrasting distant as well as closely related species is elegant. It is the first study that in detail analyzes the genomic changes that can be involved in the evolution of parasitism in fungi. I have a few comments that should be considered by the authors.

1. Gene family expansions and contractions. I would like to see “more global” data (supported with statistics) on the number of expanded/contracted genes families among the compared taxa. How many (and which) families were significantly expanded/contracted?
This is an excellent suggestion and we now include this information in the results section of the manuscript: “A phylogenetic analysis conducted on 2,798 gene families identified significant changes in gene family size between the Onygenales and their sister order, the Eurotiales, which are primarily associated with plants, as are the outgroup taxa. There are 1,043 families that are conserved in size between the orders. Evaluating those families with size changes with the statistical testing method employed by De Bie et al. (2006) reveals 13 that are significantly smaller among the Onygenales and two that are significantly larger (Figure, p < 0.05).”

2. Identification of Spherule specific genes. What method (statistical testing?) was used to identify the 1201 genes with putative spherule-specific expression?

This point was addressed in our response to Reviewer 2.

3. Shortening of the manuscript. I think that both the Result and Discussion paragraphs could be shortened. Fig. 5 can be put in supplementary material. SI4 (the phylogeny) could be removed.

As above, the manuscript has been significantly shortened.

4. Supplementary material: Explanatory legends/heads are lacking to several of the supplementary files. The texts needs to be checked for editing (e.g. species names in italics, format of references both in text and in the list of references)

This has also been corrected during revision. All supplemental files have been given legends and the text has been carefully checked for typos.

5. Editing of references. The list of references contains a number of type errors that needs to be corrected.

Typographical errors in the references have been corrected.

6. Legend to Fig. 2. (page 37) The citation to SI9 can not be correct.

This has also been fixed.
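
For readers who wish to see the two rule-based classifications from our replies to Reviewer 2 in concrete form, the following is a minimal sketch in Python. It is not the pipeline used for the manuscript, which relied on BLAT alignments and genome-wide window comparisons. The function names, the input formats (pairs of gene and stage for each aligned EST; a table of top-hit lengths for each 1 kb window) and the toy data are all hypothetical and are introduced only for illustration. Only the decision rules themselves, "at least one spherule-derived EST and no mycelial ESTs" and "top hit shorter than 500 bp", come from the text above.

```python
from collections import defaultdict


def spherule_specific_genes(est_hits):
    """Return genes matched by at least one spherule EST and no mycelial EST.

    est_hits: iterable of (gene_id, stage) pairs, one per aligned EST, where
    stage is "spherule" or "mycelium". This input format is an assumption that
    stands in for parsed EST-to-gene alignments.
    """
    stages_by_gene = defaultdict(set)
    for gene_id, stage in est_hits:
        stages_by_gene[gene_id].add(stage)
    # A gene qualifies only if every EST aligned to it came from spherules.
    return sorted(g for g, stages in stages_by_gene.items() if stages == {"spherule"})


def species_specific_windows(top_hit_bp, min_homologous_bp=500):
    """Return start coordinates of windows whose top hit is below the cutoff.

    top_hit_bp: dict mapping the start coordinate of a 1 kb window to the
    length in bp of its top hit in the other genome (0 when nothing aligns).
    This input format is likewise an assumption.
    """
    return sorted(s for s, bp in top_hit_bp.items() if bp < min_homologous_bp)


# Toy data, invented purely for illustration.
example_hits = [
    ("gene_A", "spherule"),
    ("gene_A", "mycelium"),  # seen in both stages, so not spherule-specific
    ("gene_B", "spherule"),
    ("gene_B", "spherule"),
    ("gene_C", "mycelium"),
]
example_windows = {0: 1000, 1000: 499, 2000: 0, 3000: 500}

print(spherule_specific_genes(example_hits))
print(species_specific_windows(example_windows))
```

Because the first rule is a presence/absence criterion rather than a statistical test, a gene with many spherule ESTs and a single mycelial EST is excluded, which is the conservative behavior described in our reply.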