CPN60 Gene Conservancy Via Protein-Sequence Comparison Across Extremophilic Microorganisms

Richard Barrera
Department of Biological Sciences
Saddleback College
Mission Viejo, CA 92692

Abstract

Microorganisms can survive and flourish in environments that are detrimental to the majority of life on the planet. This research focused on final polypeptide modification in extremophilic microorganisms, comparing it with protein production in more advanced, eukaryotic cells. I hypothesize that effective protein modification-management is an integral step in early cellular evolution; therefore the sample organisms will demonstrate a high level of cpn60 gene conservancy. This will provide evidence for the shared ancestry and interrelatedness of organisms separated by millions of years of evolution; the data from this study and recently published literature also hint at how prion and neurodegenerative diseases may be propagated. Specifically, this study searched for the presence of the groEL protein across twelve sub-classes of extremophiles. The groEL protein is a 60-kilodalton protein that makes up the large subunit of the groES/L complex and is the primary Type I chaperonin used for final polypeptide modification and maintenance in thousands of organisms. I searched for the presence of the groEL protein, and of the cpn60 gene that encodes it, via amino acid comparison using the National Center for Biotechnology Information’s (NCBI) Basic Local Alignment Search Tool (BLAST) server. The Homo sapiens groEL amino acid sequence was used as the query in order to demonstrate similarity between divergent species. Twenty-four extremophiles (N = 24) were identified as candidates for protein comparison, with one to three members selected per sub-class; twenty had been sequenced and were obtainable via public record; seventeen organisms demonstrated a ≥ 40% positive match, fourteen demonstrated a ≥ 70% positive match, and three organisms used a separate Type I chaperonin.
The data suggest that homologous toroid chambers for facilitation of post-translational polypeptide modification are extremely prevalent among the majority of organisms on the planet.

Introduction

It is now known that molecular chaperones participate in a large variety of cellular functions. They assist in de novo protein folding, stabilize proteins under duress, and maintain polypeptide chain components in a loosely folded state for translocation across organelle membranes (Kumarevel et al. 1998). This research focused on the presence, in extremophilic microorganisms, of the groEL protein encoded by the cpn60 gene; this 60 kD protein is the large subunit of the groEL/S complex. The groEL/S complex is one of the primary Type I chaperonins used for final polypeptide modification and, among other responsibilities, handles the folding of monomeric mitochondrial rhodanese (Mendoza et al. 1991). The query was the five hundred and four character sequence of the groEL protein in Homo sapiens. The target region is believed to be a universal target of about five hundred and fifty-five bp, and has been found to be a robust target for species-level characterization of bacteria, archaea, and eukaryotes (Hill et al. 2012). The presence of the protein sequence among the examined microorganisms illustrates a shared requirement among divergent species for post-translational polypeptide modification in order to sustain basic cellular function. This is significant because in recent years the scientific community has discovered life across a multitude of environments that were previously believed to be uninhabitable. These chaperones, in conjunction with stress-induced shock proteins, act as an efficient protein management system, preventing both the aggregation of denatured proteins within the cell and programmed cell death. Such discoveries have led researchers to reconsider the pervasiveness of life and its ability to adapt, colonize, and thrive in extraordinarily demanding environments.
Conversely, prion and neurodegenerative diseases are often the result of malfunctioning chaperones within the cytoplasm or mitochondrial matrix, respectively, which results in protein aggregation and cell death (Schon and Manfredi, 2003).

Materials and Methods

Research was conducted as follows. Thirty microorganisms across twelve subclasses of extremophiles were identified using publicly available online databases. Identified microorganisms were vetted using the Kyoto Encyclopedia of Genes & Genomes and GenBank in order to determine whether the complete genome had been sequenced and published; incomplete sequences are not available for comparison and were eliminated from the population. Genome identification continued until a population of N ≥ 20 was reached. Amino acid examination was conducted via protein comparison using the National Center for Biotechnology Information’s (NCBI) Basic Local Alignment Search Tool (BLAST) server. Percentage conservancy of the amino acid sequence was calculated between the species using the NCBI Graphic Representation Tool. A cladogram was constructed using the NCBI Phylogenetic Tool in order to visualize the divergence between species.

Results

Twenty-four species of extremophiles across twelve sub-classes were identified as candidates for comparison. Of the twenty-four species identified, twenty microorganisms representing ten groups have been sequenced and are available in the National Center for Biotechnology Information’s database. I could not locate sequenced organisms belonging to the piezophiles or xerophiles.
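The percentage figures used in the comparison step of the methods can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as "-") and follows the usual BLAST convention of dividing by the alignment length. The names `POSITIVE_PAIRS` and `match_percentages` and the short example alignment are hypothetical, and the list of "positive" residue pairs is only a small illustrative subset of substitutions that score above zero in BLOSUM62; a real analysis would use the full scoring matrix as the BLAST server does.

```python
# Illustrative subset of residue pairs with a positive BLOSUM62 score.
# Assumption: this short list stands in for the full substitution matrix.
POSITIVE_PAIRS = {frozenset(p) for p in ("IV", "IL", "LM", "KR", "DE", "ST", "FY")}


def match_percentages(query_aln: str, subject_aln: str) -> tuple[float, float]:
    """Return (identical %, positive %) for two pre-aligned sequences.

    Both percentages are taken over the full alignment length, gap columns
    included. Identical residues also count as positive matches.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    if not query_aln:
        raise ValueError("alignment is empty")

    identical = 0
    positive = 0
    for q, s in zip(query_aln.upper(), subject_aln.upper()):
        if q == "-" or s == "-":
            continue  # a gap column is neither identical nor positive
        if q == s:
            identical += 1
            positive += 1
        elif frozenset((q, s)) in POSITIVE_PAIRS:
            positive += 1

    length = len(query_aln)
    return 100.0 * identical / length, 100.0 * positive / length


# Hypothetical ten-column alignment, not taken from the study's data.
identical_pct, positive_pct = match_percentages("MLRLPTVFRQ", "MIKLP-VYRQ")
print(f"identical: {identical_pct:.1f}%  positive: {positive_pct:.1f}%")
```

In this made-up example six of ten columns are identical and three more are conservative substitutions, so the positive percentage is higher than the identical percentage, which is the same relationship seen between the two percentage columns of the hit table.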
The targeted amino acid sequence of the groEL protein was obtained from Homo sapiens, and contains five hundred and four characters:

MLRLPTVFRQMRPVSRVLAPHLTRAYAKD VKFGADARALMLQGVDLLADAVAVTMGP KGRTVIIEQSWGSPKVTKDGVTVAKSIDLK DKYKNIGAKLVQDVANNTNEEAGDGTTT ATVLARSIAKEGFEKISKGANPVEIRRGVM LAVDAVIAELKKQSKPVTTPEEIAQVATISA NGDKEIGNIISDAMKKVGRKGVITVKDGK TLNDELEIIEGMKFDRGYISPYFINTSKGQ KCEFQDAYVLLSEKKISSIQSIVPALEIANA HRKPLVIIAEDVDGEALSTLVLNRLKVGLQ VVAVKAPGFGDNRKNQLKDMAIATGGAV FGEEGLTLNLEDVQPHDLGKVGEVIVTKD DAMLLKGKGDKAQIEKRIQEIIEQLDVTTS EYEKEKLNERLAKLSDGVAVLKVGGTSD VEVNEKKDRVTDALNATRAAVEEGIVLGG GCALLRCIPALDSLTPANEDQKIGIEIIKRTL KIPAMTIAKNAGVEGSLIVEKIMQSSSEVG YDAMAGDFVNMVEKGIIDPTKVVRTALL DAAGVASLLTTAEVVVTEIPKEEKDPGMG AMGGMGGGMGGGMF

The Homo sapiens amino acid sequence was selected for comparison in order to highlight evolutionary conservancy between divergent species. Species were entered into the database, and seventeen organisms were found to contain a similar target sequence. The remaining three organisms, members of the archaea domain, utilize the DnaJ Type I chaperonin in order to complete post-translational polypeptide modification. The seventeen species of extremophiles containing the cpn60 protein sequence demonstrated a ≥ 40% positive match, and fourteen of them demonstrated a ≥ 70% positive match, as shown in Figure 1. Figure 1A gives a graphic representation of the matching sections of genome, generated by the NCBI. A visual representation of genetic divergence is depicted in Figure 2. The expected value (E-value) was ≤ 3.00 × 10^-11.

Discussion

The E-value represents background noise: it estimates the number of matches with an equal or better score that would be expected by chance in a database of the size searched, so a smaller E-value indicates a lower likelihood that a match is a false positive. The subclasses included in the research were: acidophiles, alkaliphiles, cryptoendoliths, osmophiles, lithoautotrophs, metallophiles, oligotrophs, piezophiles, psychrophiles, radiophiles, thermophiles, and xerophiles.
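For readers unfamiliar with the E-value, the following sketch shows how it relates to an alignment's bit score through the standard relation E = m × n × 2^(-S'), where S' is the bit score, m the query length, and n the total length of the database searched. This is a simplification: BLAST applies effective-length corrections that are omitted here. The function names and the database size and bit scores in the usage lines are hypothetical and are not values from this study; only the 3.00 × 10^-11 cutoff comes from the results above.

```python
def expect_value(bit_score: float, query_len: int, database_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with raw (not effective) lengths, so the
    result only approximates what a BLAST server would report.
    """
    return query_len * database_len * 2.0 ** (-bit_score)


def passes_cutoff(e_value: float, cutoff: float = 3.0e-11) -> bool:
    """True when a hit is at least as significant as the reported cutoff."""
    return e_value <= cutoff


# Hypothetical inputs: a 504-residue query against a database of
# one million residues, at two different bit scores.
weak = expect_value(bit_score=40.0, query_len=504, database_len=1_000_000)
strong = expect_value(bit_score=120.0, query_len=504, database_len=1_000_000)
print(passes_cutoff(weak), passes_cutoff(strong))
```

Because the bit score sits in the exponent, each additional bit halves the expected number of chance hits, which is why the strongest matches in a hit table are reported with E-values that round to zero.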
This data is significant because it confirms that life is predicated on the proper function, maintenance, and destruction of proteins. Cells cannot function without a form of intermediary chamber that allows polypeptide chains to assume, resume, or degrade their tertiary structures. As such, the evolution of chaperonins was an integral and promethean step in the evolution of life on Earth. Any chaperonin mutation that alters its binding of hydrolysable ATP, or alters the protein-modification chamber in such a way as to produce a renegade protein, may result in significant havoc and ultimately cell death (Walters et al. 2002). For example, research has found that the malfunctioning of oxidative phosphorylation pathways in mitochondria leads to the excess generation of reactive oxygen species. These species decimate the mitochondria, altering the structure of whatever they come in contact with, including chaperonins. These altered chaperonins can no longer fulfill their duties of protein maintenance, and as a result, the mitochondrion self-destructs (Mukherjee and Chakrabarti, 2013).

Organism                                 Identical Match %   Positive Match %   Identical Match   Expect value
Sphingopyxis alaskensis                  56.16               78.36              233               0
Saccharomyces cerevisiae                 56.98               75.23              224               0
Wallemia ichthyophaga                    56.5                72.74              235               0
Debaryomyces hansenii                    56.91               76                 231               0
(name displaced in source)               54.7                76.69              238               0
Nitrosomonas sp. AL212                   52.37               74.19              250               0
Cupriavidus metallidurans                53.77               74.15              244               0
Acidithiobacillus ferrooxidans           51.04               74.38              257               0
Thiobacillus denitrificans               53.02               73.4               247               1.00E-180
Flavobacterium psychrophilum JIP02/86    51.7                73.3               251               1.00E-179
Bacillus subtilis                        51.33               73.19              253               1.00E-179
Amphibacillus xylanus                    50.57               72.62              257               9.00E-179
(name displaced in source)               48.7                70.26              264               1.00E-163
Pyrolobus fumarii 1A                     23.36               44.86              340               1.00E-017
Methanopyrus kandleri AV19               23.23               42.47              323               5.00E-017
Methanococcoides burtonii DSM 6242       23.03               43.07              340               2.00E-015
Sulfolobus solfataricus P2               22.02               41.74              352               3.00E-011

Organism names not aligned with a row in the source: Pelagibacter ubique, Deinococcus radiodurans.

Figure 1.
Hit table generated by BLAST data analysis. Percent match is in relation to the Homo sapiens template sequence. Identical match is the number of correct chemical and spatial amino acid matches. The chart is ordered by E-value (most certain match to least).

Figure 1A. Graphical representation of the hit table from Figure 1. Red areas indicate identical matches; grey areas indicate positive matches. Organisms are listed in the same order as in Figure 1.

Figure 2. A phylogenetic tree based on genetic divergence of the 504-character amino acid sequence; shown are the seventeen sampled microorganisms that contain the cpn60 gene.

Review Form
Department of Biological Sciences
Saddleback College, Mission Viejo, CA 92692

Author(s): Richard Barrera
Title: CPN60 Gene Conservancy Via Protein-Sequence Comparison Across Extremophilic Microorganisms

Summary
Summarize the paper succinctly and dispassionately. Do not criticize here, just show that you understood the paper.

This study focuses on the relationship between protein modification-management and the CPN60 gene. The researchers used NCBI to do a protein comparison between twenty-four microorganisms (extremophiles) whose complete genomes had been sequenced. The Homo sapiens amino acid sequence was also used as a comparison with the microorganisms. When they were compared to Homo sapiens, seventeen species showed a match that was greater than 40%, and fourteen showed a match greater than 70%. Any statistics that were run for this study were unclear or absent, so the significance of the data was difficult to determine. According to the study, the data is significant, and shows that protein modification-management is essential to survival in extreme environments, and thus conservancy of the CPN60 gene is high in these organisms. The significance of the data in relation to Homo sapiens was unclear.

General Comments
Generally explain the paper’s strengths and weaknesses and whether they are serious, or important to our current state of knowledge.
The study seems very interesting, but the overall goal and results of the study were a little unclear. The hypothesis states only that organisms in the sample would have “high CPN60 gene conservancy,” but it’s difficult to discern which organisms are being referred to. The introduction and abstract discuss microorganisms (extremophiles), but the results and discussion involve a comparison with Homo sapiens. I was also a bit confused about the statistics that were run for this study. Maybe the program they used ran a different type of statistical analysis than we learned in class, but I didn’t see any statistical values showing that the data is significant. There was a sentence in the results about the percent match that some of the organisms had with Homo sapiens. Is this the statistical analysis? Overall, it was difficult to tell exactly what this study was about and what experiment was carried out. The paper was also missing a Literature Cited section, and the figures were not placed in the results section (they were at the end of the paper, in the discussion section). I’m not sure if some kind of formatting error occurred, but it made the paper look unfinished (like a rough draft).

Technical Criticism
Review technical issues, organization and clarity. Provide a table of typographical errors, grammatical errors, and minor textual problems. It's not the reviewer's job to copy edit the paper; mark the manuscript.

This paper was a final version
This paper was a rough draft

Recommendation
This paper should be published as is
This paper should be published with revision
This paper should not be published