METABOLIC PATHWAYS & GENOMICS File: genome&p

DATABASES & PATHWAYS
File name: DATABASES AND PATHWAYS 2012

INTRODUCTION

● Genome, transcriptome, proteome, phenome (mutant phenotype), and biochemical / metabolic pathway databases and their associated tools offer powerful ways to investigate metabolism.
● Genomics-driven approaches (‘database mining’) complement classical biochemical approaches to the metabolism of all organisms, including plants.

Sequence and expression information (from genomes, transcriptomes, proteomes, etc.) complements biochemical information in several ways:

1. Identifying genes for plant enzymes. Because enzymes (and some transporters) are conserved, homology searches (with BLAST programs) using prokaryotic, yeast, or animal sequences as queries can identify the corresponding plant proteins, and show whether they are encoded by single genes or gene families. Searching plant genomes in this way can show which enzymes are present and which are absent. This in turn allows ‘metabolic reconstruction’, i.e. predicting metabolic capabilities (the metabolic pathways that are present) from DNA sequence data alone.
- Plant sequences can be expressed heterologously (e.g., in E. coli or yeast, with a tag to facilitate purification), and the recombinant proteins can be characterized. This is especially useful for low-abundance or unstable proteins, which are difficult or impossible to isolate from plants in sufficient amounts for study.
- The functions encoded by plant sequences can be investigated using functional complementation in microorganisms.

2. Predicting organellar targeting and localization in membranes. Genomic sequences, cDNAs, and ESTs can give information about the organellar targeting of enzymes, via their characteristic signal sequences, and about whether proteins have membrane-spanning domains and hence are likely to be located in membranes. Organellar proteome databases can provide high-throughput experimental support for these predictions.
Knowing organellar location can rule in or out possible metabolic functions of proteins.

3. Predicting biochemical function from expression data (microarrays, RNAseq). When, where, and at what level a gene is expressed can likewise provide clues about function. Correlated patterns of gene expression (‘co-expression’) in relation to development, environment, or genetic changes (e.g., knocking out or overexpressing genes) can point to related function.

4. Predicting missing enzyme or transporter genes and new gene functions by comparative genomics. By looking for functional linkages among genes in bacteria and archaea (gene fusions, conserved gene clusters, and co-occurrence patterns) it is possible to:
- Identify enzymes and transporters that are ‘missing’ from known pathways
- Discover new enzymes, pathways, and processes.
Once a new prokaryotic enzyme has been found by this approach, its counterpart can be sought in plants via homology searches. Conversely, if an unknown plant enzyme has prokaryotic homologs, comparative genomic analysis of the latter can help predict the function of the enzyme in both groups. This is a powerful approach because prokaryotes share many pathways with plants.

********

This part of the course introduces web resources needed to extract the above types of information, and illustrates how to use them.

BASIC RESOURCES

NCBI http://www.ncbi.nlm.nih.gov/
Entrez nucleotide and protein databases; BLAST similarity search programs.

CD-Search http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi
NCBI Conserved Domain Database. Well-annotated models for ancient domains and full-length proteins.
Example:
>Ureaplasma urealyticum ATP synthase C chain (EC 3.6.3.14)
MSSFIDITNVISSHVEANLPAVSAENVQSLANGAGIAYLGKYIGTGITMLAAGAVGLMQGFSTANAVQAVARNPEAQPKILSTMIVG
LALAEAVAIYALIVSILIIFVA
Click on ‘Search for similar domain architectures’ button to access CDART tool

Pfam http://pfam.janelia.org/
Pfam protein families and domains database.
Click on ‘Sequence search’, click on hit(s), then click on ‘Domain organisation’

Multalin Sequence Alignment http://bioinfo.genotoul.fr/multalin/multalin.html
Aligns protein or DNA sequences (output in color) and draws simple phylogenetic trees.

ExPASy Translate Tool http://www.expasy.ch/tools/dna.html
Translates a DNA sequence in all 6 frames.

Phylogeny.fr http://www.phylogeny.fr/
Web-based, robust phylogenetic analysis for the non-specialist or specialist

Targeting prediction (membranes, chloroplast, mitochondrion, vacuole, etc) and targeting peptide cleavage sites:
TMHMM http://www.cbs.dtu.dk/services/TMHMM/ Prediction of transmembrane helices in proteins.
TargetP http://www.cbs.dtu.dk/services/TargetP/
Predotar http://urgi.versailles.inra.fr/predotar/predotar.html
WoLF PSORT http://wolfpsort.org/

METABOLIC PATHWAY RESOURCES

Swiss-Prot Enzyme http://ca.expasy.org/enzyme/
Enzyme nomenclature database (linked to SWISSPROT protein database, BRENDA, KEGG, etc)

BRENDA http://www.brenda-enzymes.info/
Comprehensive enzyme database.

KEGG http://www.genome.jp/kegg/
The Kyoto Encyclopedia of Genes and Genomes. Includes metabolic pathways, and compound structures that can be captured.

BioCyc, EcoCyc, MetaCyc, YeastCyc http://BioCyc.org/
EcoCyc - Encyclopedia of E. coli Genes and Metabolism; MetaCyc - Metabolic Encyclopedia. Also computationally-derived pathway/genome databases.

AraCyc http://www.arabidopsis.org/biocyc/index.jsp
Similar to BioCyc, for Arabidopsis. Software allows querying, graphical representation of pathways, and overlay of expression data on the biochemical pathway overview diagram.

MaizeCyc http://www.gramene.org/pathway/maizecyc.html

KEGG and the various Cyc databases have similar aims but each has features the others lack.

Plant Gateway for PubSEED http://pubseed.theseed.org/seedviewer.cgi?page=PlantGateway
Contains interactive pathway diagrams for plant B vitamin synthesis (Arabidopsis and maize) that include information on gene fusions.
There are also deep annotations of B vitamin pathways in tabular form.

Beware! All metabolic pathway databases have weaknesses:
- They are not necessarily up-to-date and may have omissions and errors in their pathways, so they should be checked against the literature.
- Proteins are very often (for non-model organisms, almost always) assigned functions based solely on homology, but it is not clear from the database that this is so.
- To reach firm conclusions it is therefore necessary to go to the literature to find whether a putative function has been authenticated biochemically or genetically.

PLANT GENOME RESOURCES

JGI http://genome.jgi.doe.gov/
Joint Genome Institute genome portal (all kingdoms of life)

Gramene http://www.gramene.org/
Curated, open-source data resource for comparative genome analysis of grasses and other plants

TAIR http://www.arabidopsis.org/
The Arabidopsis Information Resource.

Maizesequence.org http://www.maizesequence.org/index.html
Browser providing the latest sequence and annotation of the maize genome

PLANT TRANSCRIPTOME RESOURCES

Golm Transcriptome database http://csbdb.mpimp-golm.mpg.de/csbdb/dbxp/ath/ath_xpmgq.html
Microarray data. Gives overview of expression, searches for co-responses.

ATTED http://atted.jp/
Microarray data. Searches for co-expression patterns in Arabidopsis (and also rice); shows gene networks, not just lists of correlated genes.

qteller http://qteller.com/
RNAseq data for maize, sorghum, rice. Simple tools for expression in various organs, correlation of expression of two genes.

PLANT PROTEOME RESOURCES

PPDB http://ppdb.tc.cornell.edu/
The Plant Proteome DataBase

SUBA3 http://suba.plantenergy.uwa.edu.au/
SUB-cellular location database for Arabidopsis proteins (includes GFP and MS-MS data)

PLANT PHENOME RESOURCES

SeedGenes http://www.seedgenes.org/
Genes that give a seed phenotype when disrupted by mutation.
Chloroplast2010 http://www.plastid.msu.edu/
Large set of phenotypic data for homozygous mutants of chloroplast genes.

RAPID http://rarge.gsc.riken.jp/phenome/
RIKEN Arabidopsis Phenome Information Database; phenotypic data in transposon-insertional mutants.

COMPARATIVE GENOMICS RESOURCES

STRING http://string.embl.de/
STRING is a database of known and predicted protein-protein relationships, derived from genomic context (fusions, conserved gene clusters, co-occurrence), high-throughput experiments (co-expression), and the literature. STRING quantitatively integrates data from bacteria and other organisms.

SEED http://www.theseed.org/wiki/Main_Page
Database with 3,000+ genomes, many analysis tools. Very useful for gene cluster analysis.
Browsers compatible with SEED: Firefox (PC or Mac) or Safari (PC or Mac)
To request a SEED account: Go to http://rast.nmpdr.org/rast.cgi
* Click ‘Register a new account’, complete the form, hit ‘Request’ button
* After an automated email reply, a password will be emailed.

USING METABOLIC PATHWAY RESOURCES

• SWISS-PROT ENZYME Enzyme nomenclature database http://ca.expasy.org/enzyme/
ENZYME is a repository of information on enzyme nomenclature, with links to other databases. It describes enzymes that have been given an EC (Enzyme Commission) number, and the reactions they catalyze. It can be searched in various ways, e.g. by EC number, by common name, by substrate or product.
Example: alcohol dehydrogenase = EC 1.1.1.1
ENZYME entry page
* Links to:
BRENDA (convenient entry point)
KEGG (Kyoto University Ligand Chemical Database; maps, e.g. glycolysis)
PDB (protein structure database)
MetaCyc
Medline
Cloned enzymes in SwissProt (not exhaustive but curated, i.e. high quality)

• BRENDA Enzyme database http://www.brenda-enzymes.info/
BRENDA is an extensively referenced enzyme data information system; it includes data on substrate specificity, physical and kinetic characteristics, inhibitors, sources, cloning, purification etc.
Example: alcohol dehydrogenase EC 1.1.1.1

• KEGG Kyoto Encyclopedia of Genes and Genomes http://www.genome.jp/kegg/
KEGG computerizes knowledge of molecular and cell biology in terms of pathways that consist of interacting molecules or genes and provides links from gene catalogs produced by genome sequencing. It covers regulatory pathways and molecular assemblies as well as metabolic pathways. Its metabolic pathway maps have links to the enzymes and compounds.
Example: KEGG PATHWAY
* Metabolism of Cofactors and Vitamins - Folate biosynthesis
* Note that all enzymes (EC numbers) and intermediates are clickable, e.g. 3.5.4.16 and its product (structure can be captured).
Note that this is a composite metabolic scheme. It includes methanopterin biosynthesis (found only in methane-producing microbes) and tetrahydrobiopterin synthesis (found in animals).
Note the pulldown table (top left) of folate biosynthesis enzymes in different organisms; when an organism is selected, the enzymes putatively encoded in its genome are colored green. Compare Arabidopsis and human.

• EcoCyc, MetaCyc, and AraCyc http://BioCyc.org/
EcoCyc - Encyclopedia of E. coli Genes and Metabolism: Describes the genome and biochemical machinery of E. coli. Contains annotations of all E. coli genes, and their DNA sequences, and describes all known pathways of E. coli small-molecule metabolism. Each pathway and its component reactions and enzymes have detailed annotations, and are extensively referenced.
MetaCyc - Metabolic Encyclopedia: A metabolic-pathway database that describes pathways, reactions, and enzymes of various organisms, especially microbes. MetaCyc contains the E. coli pathways of EcoCyc, plus other pathways from the literature and on-line sources, with citations to the sources of pathways.
Example: MetaCyc
* Search tab – Pathways
* Search/Filter by ontology
* Biosynthesis
* Amino acid biosynthesis
* Superpathway of phenylalanine and tyrosine biosynthesis
* Note that all elements in pathway are clickable.

AraCyc (PlantCyc) at TAIR http://www.arabidopsis.org/biocyc/index.jsp
* Search tab – Pathways
* Search/Filter by ontology
* Biosynthesis
* Amino acid biosynthesis
* Superpathway of Lysine/Threonine/Methionine biosynthesis
* Click ‘More detail’ 2x to display genes corresponding to pathway steps
* Select species.

• Plant Gateway for PubSEED http://pubseed.theseed.org/seedviewer.cgi?page=PlantGateway
Contains interactive pathway diagrams for B vitamin synthesis in plants (Arabidopsis and maize), which include information on gene fusions. There are also deep annotations of B vitamin pathways in tabular form.
Example: Click on SEED diagram for thiamin
* Note pathway diagram showing compartmentation
* Hovering over enzyme boxes shows enzyme names, clicking on compound circles links to KEGG
* Note ‘Gene fusion events’ box
* In this box, note the fusion between a TenA protein and a HAD domain of unknown function
* Select Arabidopsis to color genome
* Green overlay indicates enzymes for which genes are known
* Clicking on these genes connects to main SEED database.

ORGANELLAR TARGETING

Targeting prediction
Example: 10-Formyltetrahydrofolate deformylase (PurU) is an enzyme found in E. coli and many other bacteria (e.g., the cyanobacterium Nostoc) that hydrolyzes 10-formyltetrahydrofolate, releasing formate. The Arabidopsis genome encodes two homologs of E. coli PurU (At5g47435 and At4g17360).
>E_coli gi|548645|sp|P37051|PURU_ECOLI FORMYLTETRAHYDROFOLATE DEFORMYLASE (FORMYL-FH(4) HYDROLASE) MHSLQRKVLRTICPDQKGLIARITNICYKHELNIVQNNEFVDHRTGRFFMRTELEGIFNDSTLLADLDSA LPEGSVRELNPAGRRRIVILVTKEAHCLGDLLMKANYGGLDVEIAAVIGNHDTLRSLVERFDIPFELVSH EGLTRNEHDQKMADAIDAYQPDYVVLAKYMRVLTPEFVARFPNKIINIHHSFLPAFIGARPYHQAYERGV KIIGATAHYVNDNLDEGPIIMQDVIHVDHTYTAEDMMRAGRDVEKNVLSRALYKVLAQRVFVYGNRTIIL >Nostoc gi|186681065|ref|YP_001864261.1| formyltetrahydrofolate deformylase [Nostoc punctiforme PCC 73102] MMTNPTATLLISCPDQRGLVAKFANFIYSNGGNIIHADQHTDFAAGLFLTRIEWQLEGFNLPREFIAPAF NAIAQPLSAKWEIRFSDTVPRIAIWVSRQDHCLFDLIWRQRAKEFVAEIPLIISNHANLKVVAEQFNIDF QHVPITKDNKSEQEAQQLELLRQYKIDLVVLAKYMQIVSADFINQFSQIINIHHSFLPAFIGANPYHRAF ERGVKIIGATAHYATADLDAGPIIEQDVVRVSHRDEVDDLVRKGKDLERVVLARAVRSHLQNRVLVYGNR TVVFE >At5g47435 gi|18422794|ref|NP_568682.1| formyltetrahydrofolate deformylase, putative [Arabidopsis thaliana] MIRRITERASGFAKNIPILKSSRFHGESLDSSVSPVLIPGVHVFHCQDAVGIVAKLSDCIAAKGGNILGY DVFVPENNNVFYSRSEFIFDPVKWPRSQVDEDFQTIAQRYGALNSVVRVPSIDPKYKIALLLSKQDHCLV EMLHKWQDGKLPVDITCVISNHERASNTHVMRFLERHGIPYHYVSTTKENKREDDILELVKDTDFLVLAR YMQILSGNFLKGYGKDVINIHHGLLPSFKGGYPAKQAFDAGVKLIGATSHFVTEELDSGPIIEQMVESVS HRDNLRSFVQKSEDLEKKCLTRAIKSYCELRVLPYGTNKTVVF >At4g17360 gi|15236046|ref|NP_193467.1| formyltetrahydrofolate deformylase, putative [Arabidopsis thaliana] MIRRVSTTSCLSATAFRSFTKWSFKSSQFHGESLDSSVSPLLIPGFHVFHCPDVVGIVAKLSDCIAAKGG NILGYDVLVPENKNVFYSRSEFIFDPVKWPRRQMDEDFQTIAQKFSALSSVVRVPSLDPKYKIALLLSKQ DHCLVEMLHKWQDGKLPVDITCVISNHERAPNTHVMRFLQRHGISYHYLPTTDQNKIEEEILELVKGTDF LVLARYMQLLSGNFLKGYGKDVINIHHGLLPSFKGRNPVKQAFDAGVKLIGATTHFVTEELDSGPIIEQM VERVSHRDNLRSFVQKSEDLEKKCLMKAIKSYCELRVLPYGTQRTVVF Targeting predictions for the At5g47435 and At4g17360 proteins using: TargetP: http://www.cbs.dtu.dk/services/TargetP/ Paste in both Arabidopsis sequences * Check ‘Plant’, ‘Perform cleavage site predictions’ Predotar: http://urgi.versailles.inra.fr/predotar/predotar.html Paste in both Arabidopsis sequences The prediction algorithms agree 
that both proteins are mitochondrial. To check this, align them with the bacterial PurU sequences using Multalin http://bioinfo.genotoul.fr/multalin/multalin.html
* Alignment shows that both Arabidopsis proteins have N-terminal extensions of ~35 residues (a typical size for a mitochondrial targeting peptide).
* Align just the two Arabidopsis sequences; note that the N-terminal extensions are not conserved (typical of targeting sequences).

Targeting – proteome databases with experimental findings

PPDB http://ppdb.tc.cornell.edu/
Click on ‘Accession’
* Paste AGI number(s) in box, e.g. At1g03475 (Coproporphyrinogen III oxidase)
* Click on link(s)
* Displays proteomic evidence, both in the database and published.

SUBA3 http://suba.plantenergy.uwa.edu.au/
Click Search tab
* Paste AGI number(s) in lower box, e.g. At1g03475
* Click + ‘Arabidopsis Gene Initiative (AGI) identifier(s) is in list’, then ‘Query’
* Displays evidence.

PHYLOGENETIC TREES

Using Phylogeny.fr: Select ‘One Click’ mode
* Paste in the four PurU sequences above
* Click ‘Submit’
* Carries out, in sequence, alignment with MUSCLE, Maximum Likelihood tree-building with PhyML, and tree drawing with TreeDyn. The analysis runs the aLRT statistical test, which gives results similar to the bootstrap procedures but is much faster.
* Download the tree in preferred image format.
‘Advanced’ and ‘A la carte’ modes are available for experienced users.

MICROARRAY DATABASES

Microarrays: The 22K Affymetrix chip contains most Arabidopsis genes, so in principle it can be used to monitor the expression of almost all metabolic genes. However, many metabolic genes have low expression levels, and so cannot be monitored with confidence. Genes with low average expression levels tend to give large numbers of spurious co-expression matches.

mRNA abundance in general correlates broadly with protein abundance and with in-vivo metabolic fluxes.
Therefore digital gene expression data can indicate which organs have a pathway and which do not, and whether a pathway is likely to be a major or minor one. Note also that primary metabolic pathways are expressed everywhere and always, and that secondary pathways by definition are not. Unexpected differences in expression may provide clues about genetic control of pathways, e.g. an enzyme whose transcript level varies more than that of others in the pathway (i.e. is highly regulated, not constitutive) may be a control point in the pathway.

Microarray-based gene expression profiling using the Golm Transcriptome database http://csbdb.mpimp-golm.mpg.de/csbdb/dbxp/ath/ath_xpmgq.html

For an overview of expression in different organs and in different environmental conditions: On face page, paste in one or several AGI numbers, e.g. At3g12930 At5g47190 At2g39800 (At3g12930 is the plastid Iojap protein; At5g47190 is chloroplast ribosomal protein L19; At2g39800 is the first enzyme of proline biosynthesis)
* Scroll down to graphs. Note positive correlation between At3g12930 and At5g47190. Note induction of At2g39800 by stresses.

To search for positively correlated genes, go to Transcript Co-Response, Single Gene Query, paste in At5g47190
* Select a dataset (‘Matrix’), e.g. developmental series
* Select an output, e.g. positive, top 100 of co-responding genes
* Scroll down list of hits; note many strong correlations with other chloroplast ribosomal proteins, which associate together to form the protein complexes of the ribosome

Microarray-based gene expression profiling using ATTED http://atted.jp/

On face page Search box select ‘Gene ID’, paste an AGI number, e.g.
At5g47190 in box, click ‘Search’
* Click on link (‘Target’ box summarizes targeting predictions)
* Displays coexpressed gene network around At5g47190
* Note the many proteins related to plastid ribosomes
* Click on coexpressed gene list for more coexpressed genes
* Check all 4 boxes
* Default ranking is by all datasets
* Rankings in individual datasets (e.g. tissue type, abiotic stress) can also be displayed
* In ‘Link’ column, graph icon displays correlation data points
* Osa homolog column shows putative rice ortholog, clicking on link displays correlation list for rice genes
* Note many ribosome associations of rice homolog of best Arabidopsis hit.

RNAseq-based gene expression profiling using qteller http://qteller.com/

Select maize, paste in GRMZM2G161299 or GRMZM2G420119
* Note expression profile by organ
* Use these two genes for correlation analysis.

PLANT PHENOME DATABASES

Although plant phenome databases are less developed than the phenotype databases for mutants in model microorganisms, several such resources exist for plants, and they are growing.

SeedGenes http://www.seedgenes.org/
Covers ~350 Arabidopsis genes that give a seed phenotype when disrupted by mutation. Click ‘Enter’, click ‘Access the SeedGenes Query Page’, ‘Browse genes’, search for AGI numbers in list.

Chloroplast2010 http://www.plastid.msu.edu/
Has morphological and metabolic phenotype data for >5,000 mutants in genes whose products are predicted to be chloroplast-targeted. In ‘Large scale phenomics data’ click ‘Here’
* In ‘Phenotypic Analysis Overview’ click ‘Here’
* Log in or sign up to get an account
* In Search by Query Term(s) area, search by AGI number, e.g. At4g25050, At1g10310 (be sure to avoid blank spaces or empty lines)
* Click on links to genes
* See tabs for morphology, leaf amino acid profile, etc

RAPID http://rarge.gsc.riken.jp/phenome/
Phenotypic data in transposon-insertional mutants. Click ‘Line list’
* Search for AGI number, e.g.
At2g48120
* Copy line code 11-2389-1
* Click on ‘Search’
* Paste line code into search box
* Click ‘Search’
* Displays image of albino seedling

USING GENOME RESOURCES TO FIND PLANT ENZYME GENES

This exercise demonstrates how to find Arabidopsis and maize genes encoding an enzyme, starting from the sequence of a bacterial enzyme, 5,10-methylenetetrahydrofolate reductase, EC 1.5.1.20 (MetF).

Go to Swiss-Prot Enzyme, enter 1.5.1.20
* Click on link to E. coli MetF
* Capture FASTA sequence
* Go to NCBI Protein BLAST search
* Select Arabidopsis thaliana
* Hits on MTHFR1 and MTHFR2 (At3g59970 and At2g44160). Note multiple entries for each gene
* Capture full-length (about 590 residues) FASTA text sequences, save to Word file
* Align in Multalin to confirm their very high similarity.

To find maize homologs, go to Maizesequence.org, click ‘BLAST’ in header bar
* Paste either Arabidopsis sequence in search box
* Select ‘peptide queries’, ‘peptide database’, ‘Filtered gene set peptides’, ‘BLASTP’, search sensitivity ‘no optimization’, click ‘Run’
* In output, if necessary turn on all columns, select ‘E-val’ in Stats and <E-val in Sort By
* To see alignments, click [A]
* Very strong hit, GRMZM2G347056 (593 residues) on chromosome 1; also second hit, truncated (382 residues), GRMZM2G034278 on chromosome 5. (Third hit is a small fragment)
* To capture protein sequences, click on GRMZM identifiers, ‘Protein sequence’, save to Word file.

Sequence alignment indicates that GRMZM2G034278 is distinct from GRMZM2G347056 and both Arabidopsis sequences in lacking ~200 residues at the C-terminus and in having a very different N-terminal region of ~80 residues. GRMZM2G034278 is thus almost certainly an incorrectly-called gene or a pseudogene.
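
The length comparison behind this conclusion can be sketched in a few lines of Python. This is only an illustration, not part of the original exercise: the function name `flag_truncated`, the 0.8 cutoff, and the use of 590 residues as the reference length are assumptions made for the example, while the hit lengths are those reported in the BLAST output above.

```python
def flag_truncated(hit_lengths, reference_length, min_fraction=0.8):
    """Return {gene_id: fraction_of_reference} for hits shorter than the cutoff.

    min_fraction is an arbitrary illustrative threshold, not a standard value.
    """
    flagged = {}
    for gene_id, length in hit_lengths.items():
        fraction = length / reference_length
        if fraction < min_fraction:
            flagged[gene_id] = round(fraction, 2)
    return flagged


# Lengths (residues) of the two maize hits as reported in the BLAST output.
maize_hits = {"GRMZM2G347056": 593, "GRMZM2G034278": 382}

# ~590 residues: approximate full length of the Arabidopsis MTHFR proteins.
suspects = flag_truncated(maize_hits, reference_length=590)
print(suspects)  # {'GRMZM2G034278': 0.65}
```

A short hit is only a prompt for closer inspection; the alignment and the EST evidence are still needed to decide whether the gene model is truncated, miscalled, or a pseudogene.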
To check whether these genes are expressed, use GRMZM2G034278 and GRMZM2G347056 protein sequences in tBLASTn against maize ESTs: - 50 exactly match GRMZM2G082463 (allowing for imperfections characteristic of EST sequences) None appear to exactly match GRMZM2G034278 Therefore, since the predicted GRMZM2G034278 protein is truncated, and has no cognate ESTs (i.e. is not transcribed), it is most probably a pseudogene. Note that ~85% of the maize genome consists of hundreds of families of transposable elements. These are responsible for capture and amplification of many gene fragments. >MetF 5,10-methylenetetrahydrofolate reductase [Escherichia coli str. K-12 substr. MG1655] MSFFHASQRDALNQSLAEVQGQINVSFEFFPPRTSEMEQTLWNSIDRLSSLKPKFVSVTYGANSGERDRT HSIIKGIKDRTGLEAAPHLTCIDATPDELRTIARDYWNNGIRHIVALRGDLPPGSGKPEMYASDLVTLLK EVADFDISVAAYPEVHPEAKSAQADLLNLKRKVDAGANRAITQFFFDVESYLRFRDRCVSAGIDVEIIPG ILPVSNFKQAKKFADMTNVRIPAWMAQMFDGLDDDAETRKLVGANIAMDMVKILSREGVKDFHFYTLNRA EMSYAICHTLGVRPGL >MTHFR1 gi|15232215|ref|NP_191556.1| methylenetetrahydrofolate reductase 1 [Arabidopsis thaliana] MKVVDKIKSVTEQGQTAFSFEFFPPKTEDGVENLFERMDRLVSYGPTFCDITWGAGGSTADLTLEIASRM QNVICVETMMHLTCTNMPIEKIDHALETIRSNGIQNVLALRGDPPHGQDKFVQVEGGFACALDLVNHIRS KYGDYFGITVAGYPEAHPDVIEADGLATPESYQSDLAYLKKKVDAGADLIVTQLFYDTDIFLKFVNDCRQ IGINCPIVPGIMPISNYKGFLRMAGFCKTKIPAELTAALEPIKDNDEAVKAYGIHFATEMCKKILAHGIT SLHLYTLNVDKSAIGILMNLGLIDESKISRSLPWRRPANVFRTKEDVRPIFWANRPKSYISRTKGWNDFP HGRWGDSHSAAYSTLSDYQFARPKGRDKKLQQEWVVPLKSIEDVQEKFKELCIGNLKSSPWSELDGLQPE TKIINEQLGKINSNGFLTINSQPSVNAAKSDSPAIGWGGPGGYVYQKAYLEFFCSKDKLDTLVEKSKAFP SITYMAVNKSENWVSNTGESDVNAVTWGVFPAKEVIQPTIVDPASFKVWKDEAFEIWSRSWANLYPEDDP SRKLLEEVKNSYYLVSLVDNNYINGDIFSVFA >MTHFR2 gi|18406468|ref|NP_566011.1| methylenetetrahydrofolate reductase 2 [Arabidopsis thaliana] MKVIDKIQSLADEGKTAFSFEFFPPKTEDGVDNLFERMDRMVAYGPTFCDITWGAGGSTADLTLDIASRM QNVVCVESMMHLTCTNMPVEKIDHALETIRSNGIQNVLALRGDPPHGQDKFVQVEGGFDCALDLVNHIRS KYGDYFGITVAGYPEAHPDVIGENGLASNEAYQSDLEYLKKKIDAGADLIVTQLFYDTDIFLKFVNDCRQ 
IGISCPIVPGIMPINNYRGFLRMTGFCKTKIPVEVMAALEPIKDNEEAVKAYGIHLGTEMCKKMLAHGVK SLHLYTLNMEKSALAILMNLGMIDESKISRSLPWRRPANVFRTKEDVRPIFWANRPKSYISRTKGWEDFP QGRWGDSRSASYGALSDHQFSRPRARDKKLQQEWVVPLKSVEDIQEKFKELCLGNLKSSPWSELDGLQPE TRIINEQLIKVNSKGFLTINSQPSVNAERSDSPTVGWGGPVGYVYQKAYLEFFCSKEKLDAVVEKCKALP SITYMAVNKGEQWVSNTAQADVNAVTWGVFPAKEIIQPTIVDPASFNVWKDEAFETWSRSWANLYPEADP SRNLLEEVKNSYYLVSLVENDYINGDIFAVFADL >GRMZM2G347056 MKVIEKILEAAGDGRTAFSFEYFPPKTEEGVENLFERMDRMVAHGPSFCDITWGAGGSTA DLTLEIANRMQNMVCVETMMHLTCTNMPVEKIDHALETIKSNGIQNVLALRGDPPHGQDK FVQVEGGFACALDLVQHIRAKYGDYFGITVAGYPEAHPDAIQGEGGATLEAYSNDLAYLK RKVDAGADLIVTQLFYDTDIFLKFVNDCRQIGITCPIVPGIMPINNYKGFLRMTGFCKTK IPSEITAALDPIKDNEEAVRQYGIHLGTEMCKKILATGIKTLHLYTLNMDKSAIGILMNL GLIEESKVSRPLPWRPATNVFRVKEDVRPIFWANRPKSYLKRTLGWDQYPHGRWGDSRNP SYGALTDHQFTRPRGRGKKLQEEWAVPLKSVEDISERFTNFCQGKLTSSPWSELDGLQPE TKIIDDQLVNINQKGFLTINSQPAVNGEKSDSPTVGWGGPGGYVYQKAYLEFFCAKEKLD QLIEKIKAFPSLTYIAVNKDGETFSNISPNAVNAVTWGVFPGKEIIQPTVVDHASFMVWK DEAFEIWTRGWGCMFPEGDSSRELLEKVQKTYYLVSLVDNDYVQGDLFAAFKI >GRMZM2G034278 MCMLLRKDSGHYLAIVVYVKCCSLEEERRKERIPTELMRSFILTSHTAPGRAPAASSICN DRTRRRAELLSSYIYNSSTKVCVETMMHLTCTNMPVEKIDHALETIKFNGIHNVLALRGD PPHGQDKFVQVEGGFACALDLVQHIRSKYGDYFGITVAGYPEAHPDAIQGEGGATLEAYS NDLAYLKRKVDAGADLIVTQLFYDTDIFLKFVNDCRQIGITCPIVPGIMPINNYKGFMRM TGFCKTKIPSEITAALDPIKDNEEAVRAYGIHLGTEMCKKIIASGIKTLHLYTLNVDKSA LGILMNLGLIEESKVSRSLPWRPATNVFRVKEVVRPIFWASRPKSYLKRTLGWDQYPHEG GVILETHHMEHLGIVHKTTWTW
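
Sequences captured from these databases are in FASTA format, and pasted text often picks up stray spaces or line wraps inside a sequence. As a minimal sketch (the function name `parse_fasta` and the toy records are illustrative assumptions, not part of the course material), saved FASTA text with one header per line can be loaded and the sequence lengths compared in Python:

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {header: sequence}.

    Whitespace inside sequences (wrapped lines, stray spaces) is removed.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            records[header].append("".join(line.split()))
    return {name: "".join(parts) for name, parts in records.items()}


# Toy records for illustration only; paste real FASTA text in their place.
example = """>toy_full_length
MKVIEKILEA AGDGRTAFSF
EYFPPKTEEG
>toy_truncated
MKVIEKILEA
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq))
```

The resulting lengths can then be compared against a full-length reference to spot truncated gene models before running an alignment.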
