EDITS BY DAVID HANCOCK 16/11/2006
Replies by Oliver Jones 16/11/2006

CIMR: Environmental Analysis Context

QUESTION: what is CIMR?
ANSWER: CIMR stands for Core Information for Metabolomics Reporting. The title page I used is the same one as for all standards.

Metabolomics Standards Initiative (MSI)

Sponsor: Metabolomics Society (http://www.metabolomicssociety.org/)
Reference: http://msi-workgroups.sourceforge.net/bio-metadata/reporting/env/
Version: 1.1
Date: 15/11/2006
Authors:
Dan Bearden, NOAA, Charleston SC, USA (dan.bearden@noaa.gov)
Jake Bundy, Imperial College, UK (j.bundy@imperial.ac.uk)
Tim Collette, EPA, Athens GA, USA (collette.tim@epamail.epa.gov)
Felicity Currie, University of Manchester, UK (felicity.currie@postgrad.manchester.ac.uk)
Matt Davey, University of Sheffield, UK (m.davey@sheffield.ac.uk)
Dawn Field, EBI, UK (dfield@ceh.ac.uk)
David Hancock, University of Manchester and EBC, UK (hancockd@cs.man.ac.uk)
Oliver Jones, University of Cambridge, UK (oahj2@mole.bio.cam.ac.uk)
Norman Morrison, University of Manchester, UK (morrison@cs.man.ac.uk)
Simone Rochfort, DPI, Werribee, Australia (simone.rochfort@dpi.vic.gov.au)
Susanna-Assunta Sansone, EBI, Cambridge, UK (sansone@ebi.ac.uk)
Dalibor Stys, Jihočeská univerzita v České Budějovice, Czech Republic (stys@jcu.cz)
Quincy Teng, EPA, Athens GA, USA (Teng.Quincy@epa.gov)
Mark Viant, University of Birmingham, UK (m.viant@bham.ac.uk)

COMMENT: what ordering has been used to arrange the author list? Originally Norman had it in alphabetical order, which seems fair. [I just put them in the order they appeared in my inbox, starting with the person who replied to Mark’s first e-mail about it after the Boston meeting. I put Norman at the top as he is the chair. Alphabetical order works as well though. I don’t mind which is used.]

Credits: Edited by Norman Morrison and Oliver Jones
Chair: Norman Morrison

Copyright © 2006 MSI - Metabolomics Society and contributing authors

Table of Contents
1.0. This Document
2.0. Goals of the Environmental Context Working Sub-Group
3.0. Required Information
4.0. Minimum Reporting Standards for Environmental Metabolomic Studies
4.1. Sample Type
4.2. Site Parameters
4.2.1. Parameters Common to all Study Types
4.2.2. Additional Parameters for Aquatic Samples
4.2.3. Additional Parameters for Terrestrial Samples
4.3. Details of Specimens
5.0. Additional Information
6.0. Bibliography

CIMR-Environmental

1.0. This Document
This document forms part of the standards for reporting metabolomics experiments developed under the Metabolomics Society (http://www.metabolomicssociety.org/) Metabolomics Standards Initiative (MSI). It should be read in the context of the top-level document for those standards. The current version of the document is a work in progress.

The scope of our effort is to identify, develop and disseminate a core set of reporting requirements [I would argue that ‘descriptors’ are not the same as ‘reporting requirements’ – descriptors are formally defined terms that can be used when reporting something] [Maybe we could remove the word descriptors here?]
necessary for the consistent annotation of environmental metabolomic experiments. This working group considers environmental metabolomics as the application of metabolomics to the fields of environmental biology and toxicology [do we need to make a distinction between the two?]. [Yes, I think we do; they are not the same thing.] The aim is not to be prescriptive about how one should perform metabolomic experiments, but to formulate a minimum set of reporting requirements that describe said experiments (what they are, what materials were used, and how they were actually executed). Consequently, there will be no attempt to restrict or dictate specific practices; the intent is to develop consistent and appropriate descriptors [see above] to support the dissemination and re-use of metabolomic data. Such reporting standards will specify the data identified as necessary for adequate and correct reporting in a range of identified contexts, such as submission to academic journals and public databases. The group also recommends that data exchange standards be developed in order to provide a transparent technical vehicle which meets or exceeds the requirements of reporting standards. It is also stressed that the requirements below are merely a minimal list and so are not exhaustive.

COMMENT: I think we need to be more consistent in our use of the terms: descriptors, standards and requirements. I suggest that there is a difference between semi-formal (i.e. natural language) descriptions of what should be reported about a study, and formal definitions of terms and compositions of those terms. [I agree; perhaps we could decide on this as a group.]

2.0. Goals of the Environmental Context Working Sub-Group
The ECWSG seeks to:
I. Work cooperatively on a consensus document for a minimum set of reporting requirements needed to interpret unambiguously the results of a metabolomics-based experiment generated in an environmental context, and potentially to reproduce the experiment; that is, to evaluate, understand, reproduce, compare and re-investigate [what does this mean? Do we mean discover new information, or validate existing information?] metabolomic data generated in an environmental context. [To me this means validate existing information, for instance, if you are reviewing a paper or want to repeat the work.]
II. Represent, in an unbiased and open fashion, the views of community members working in the area of environmental biology [see point about toxicology above] [see answer about toxicology above] using metabolomics approaches. The group will engage with environmental metabolomics practitioners through an iterative process of use case provision and validation of selected descriptors [see previous comments about requirements, standards and descriptors]. [see answer about requirements etc. above]
III. Evaluate previous and relevant standardisation efforts in environmental toxicology and biology, including similar work in genomics, transcriptomics and proteomics studies.
IV. Interact with the MICheck (micheck.sf.net) project as well as with individual technology-specific groups (i.e. MGED, HUPO-PSI, GSC). The group recognises that the larger scientific community will best be served if the resulting specification overcomes duplications across omics domains where commonality of the concepts exists.
V. Evaluate our reporting requirements in the context of the other biological context working groups, with the aim of identifying any overlaps or gaps in our reporting.

The ECWSG identifies four possible scenarios for environmental investigations:
1. Field Trials - Investigations in which a sample is taken directly from the field.
2. Field to Lab - Investigations in which a sample is taken from the field and conditioned in the lab under controlled conditions.
3. Lab to Field - Investigations in which a sample is taken from the lab and conditioned in the field under semi-controlled conditions.
4. Lab Investigation - Investigations in which a sample is lab reared or obtained from a standard provider and conditioned in the lab under controlled conditions.

COMMENT: Is this list exhaustive? For instance, what about “field to lab to field” or “lab to field to lab” scenarios? I wonder whether we need to worry about this, or should instead provide templates for describing “field info”, “lab info”, “capture info” and “release info”, with which any permutation of lab and field could be represented. [REPLY: This is what we came up with and discussed quite extensively at the teleconference, but I think either would be OK.]

Furthermore, the ‘environment’ of the organism(s) described as part of an investigation using metabolomic technologies falls into three broad categories:
A. Uncontrolled Environment. The environment of the organism has not been artificially modified for the purposes of the investigation (e.g. point 1 above).
B. Semi-controlled Environment. The environment of the organism has been partially artificially controlled for the purposes of the investigation (see points 2 & 3 above).
C. Controlled Environment. The environment of the organism has been fully artificially controlled or recreated for the purposes of the investigation (see point 4 above).

*Biotic Environment. We recognise [I actually liked this part better how it was originally but I don’t think it makes that much difference] that an organism can have a ‘biotic environment’, where the environment of an organism is another organism (e.g. in a parasitic relationship). This scenario necessitates a nested description of the environment, including a description of the biotic environment of the primary organism and that of the host.
This ‘use case’ scenario is considered less common in metabolomic experiments than in other ‘omic technologies.

3.0. Required Information
The minimal information set for reporting the samples in environmental metabolomic experiments builds upon the general practice of reporting biological experiments in scientific journals in such a way that ‘the materials and methods section should include sufficient information to allow the experiments to be repeated’ (‘Instructions to authors, ASM’ - http://www.journals.asm.org/). These include aspects such as:
Species/strains/bioresource
Source of the strains
Experimental design

COMMENT: Are not the last three items in this list actually part of ‘experimental design’? [I think you are right here and have changed it accordingly.]

These aspects are considered to be general aspects that are reported in every biological scientific paper/experiment, and are not addressed by this subgroup. However, scientific journals also ask that authors describe in greater detail those conditions that are critical to the experiments performed. Below we have identified those factors that are critical to environmental metabolomic experiments, for which the materials and methods used [not just the methods, but the materials] [I agree and have changed this] should be described in greater detail. The challenge, therefore, is not to capture everything, but to provide a framework where the environmental features considered relevant to a particular sample, which will be highly context-dependent, can be captured in a structured and extensible manner.

Metabolite levels can be perturbed by many different factors which may be unrelated to the study being undertaken. These factors must therefore be taken into account when analysing the resulting metabolomics data. Please report the data requirements outlined below as accurately as possible.
All information outlined below should be considered required (if applicable) unless marked in italics, in which case it is recommended further information but is not automatically required. [I don’t understand this bit – are we saying that external stimuli need to be recorded?] [No, not at all; just that metabolite levels can be perturbed by many different factors which may well be unrelated to the study. They must therefore be taken into account when analysing the results.]

4.0. Minimum Reporting Standards for Environmental Metabolomic Studies
Sources:
The International Council for the Exploration of the Sea (ICES) Integrated Environmental Reporting Format document [1]
Data reporting requirements from the PETRAEA project [2]
Plant Information Management System (PIMS) Database Schema [3]
Protocols for Standard Measurements at Freshwater Sites [4]
[5] and [6]

4.1. Sample Type
Is the sample Environmental (e.g. terrestrial or aquatic) [I agree with Dawn, this list should be longer, otherwise the reader thinks we have divided the world into just two categories] [Well, essentially it is divided into three very, very general categories: terrestrial, aquatic and airborne. This step was just to start the process off. We could add extra categories (such as arboreal or airborne) here, but will anybody actually be sampling from the latter? And don’t forget we also have a more specific section on environment/habitat type in section 4.2.1.] or Laboratory based (e.g. cage, biosphere, aquarium, agar broth) (see points 2, 3, 4 and B and C in section 2.0)?

COMMENT: what about the scenarios discussed in Section 2? I am not convinced that dividing the sample this way adds any value. I prefer the original proposal that the environment of the sample be described, whether that environment be wild (natural) or artificial. Many (if not most) of the same factors might be of interest irrespective of the type of environment that the sample came from.
[Reply: I strongly disagree here; I think it adds a lot of value. It is also not new. This was discussed extensively in the e-mail correspondence, where I thought there was a fairly clear consensus that a ‘one size fits all’ approach was not the way forward. There are clearly a lot of factors relevant to terrestrial samples which are not relevant at all to aquatic samples, and vice versa. I can’t actually think of any more environment types that don’t fall into either of these two general categories. For instance, rivers, lakes and even pore water come under aquatic, while soil, desert, polar, urban, fields and woods etc. come under terrestrial. Remember, the general idea behind the set-up was that a researcher submitting a paper online would start by saying if the sample was aquatic or terrestrial and, depending on the answer, would be given the specific parameters for each one automatically.]

4.2. Site Parameters
4.2.1. Parameters Common to all Study Types
Environment/habitat type, e.g. marine, freshwater, river, lake, ocean, sediment, sediment pore water, field, intertidal zone, soil, grassland, desert, polar, urban etc. (see points 1, 2, 3 and A and B in section 2.0)
COMMENT: these environment types might also be artificially recreated in the lab to the extent that the same cannot discern the difference. Also, this list doesn’t cover arboreal or airborne samples. [The list is not meant to be exhaustive. I take your point about recreating the environment in the lab, but I still think that scenario would come more under the in vivo group’s standards that have already been developed.]
Geographic location (latitude and longitude in degrees/minutes/decimal minutes)
National Grid Reference
Sampling dates and/or times if a time course experiment is being run.
(Perhaps these should be collapsed into a single requirement; as it stands we don’t define what we mean by time. To me this arrangement implies that sampling occurred between two time points on a single day.) [I agree and have changed it]
Supplier name and details if organisms are obtained from a standard provider (what about the lab->field scenario?) [I agree and have changed this section]
Concentration of any pollutants known (e.g. heavy metals). If samples are dosed please provide details of the compound, exposure route, dose, dose volume [is this different to dose? And how does it relate to concentration? E.g. we might know the amount of mercury in the soil but not know how much the earthworms are eating], duration of dosing, dosage vehicle (perhaps remove the ‘relevant to the study’ clause; who is to say what pollutant is relevant?) [Good point. The section is now changed]
Time elapsed since an identifiable event, e.g. time elapsed since dose of environmental toxicant (this should be part of the above parameter) [I am not sure about this; the identifiable event does not have to be dosing of a pollutant, it could be since the sun set or the tide went out]
Growth conditions (perhaps this is part of the [? – if you mean part of weather then I don’t agree, since growth conditions could also include things like nutrient levels etc.]
Weather (e.g. raining, snowing, sunny)
Lunar phase

I wonder whether these parameters need rearranging into more than one category; this seems to be somewhat of a melange of environmental information, organism information and experimental information. [Well, since these parameters were designed to apply to all study types it is perhaps not surprising that they are a bit of a mix. We should also stress that whatever list we generate is not exhaustive, merely a minimal list of things. [I agree and have put this at the end of section 1.0]

4.2.2. Additional Parameters for Aquatic Samples
Temperature (air and water)
Water depth
Specimen was submerged/emerged
Water salinity
Water pH
Dissolved and/or total organic carbon level
Substrate type and organic carbon content (if using benthic organisms)
Tidal phase (if applicable)
Relevant water quality criteria. At a minimum, include the following values: suspended solids, five day biochemical oxygen demand (BOD5), nitrate (including ammoniacal nitrogen) and phosphate levels.
Oxygen content (DOC) and possibly other ‘standard’ measures of water quality? [I agree, but I don’t think we should use DOC for dissolved oxygen, since it also stands for dissolved organic carbon. I think Biochemical Oxygen Demand (BOD5) might be more useful here, as well as the others now listed above]

It is perhaps somewhat restrictive to introduce separate requirements for ‘aquatic’ and ‘terrestrial’ environments. I have no doubt that experimentalists will find that their particular set-up falls between any potential cracks that we leave. For example, does the shoreline count as aquatic or terrestrial? [Again, I was under the impression we had already agreed this in the e-mail discussions. I do not think it is restrictive. Whether the shoreline counts as aquatic or terrestrial would depend on the tidal phase (which we have already required) and also on where on the shoreline the sample was taken from. I think all this can already be specified from the standard.]

4.2.3. Additional Parameters for Terrestrial Samples
Temperature (air and soil)
Altitude
Inclination
Aspect (does this refer to direction relative to the sun?) [I took this from the PETRAEA standard. I believe it refers to the direction and angle which a particular mountain or hill faces (http://en.wikipedia.org/wiki/Aspect_%28geography%29). Perhaps someone, e.g. Matt, could clear this up.]
Substrate type
Substrate organic carbon content and pH (why specify carbon in particular?)
[The organic carbon level is a standard required parameter for environmental studies because it will affect how any pollutants move within the system.]
Collection conditions e.g. humidity, precipitation, wind speed (Beaufort scale) and direction [good point, this is now changed]
Image data, for example links and references to photographs of plant samples taken in the field during collection, may also sometimes be necessary for detailed identification of individual organisms, for example in time course studies on wild plant populations.

4.3. Details of Specimens
Sample identification
Genus/species of specimen
Common name of specimen
Weight of specimen
Age of specimen
Sex of specimen
Stage of development
Ecotype
Genotype
Condition of specimen
Method of preservation
Method of shipping from environment to laboratory
Method of storage in laboratory
Acclimation procedure if specimens are moved from the field to the laboratory or vice versa

COMMENTS: It is not clear to me what the relationship between sample and specimen is. [I agree this could be unclear. To me the sample type refers to the area the specimen was gathered in, while the specimen is the organism itself. I’m not sure how to change things to make it clearer yet though.]
Shouldn’t information about replication and pooling be recorded in the description of the experimental processes? Otherwise we are mixing up environmental reporting with other aspects of the experiment. [Hmm, tricky, but I think you are probably right here and I have changed it accordingly]
Further to my comments above about the lab/field issues, I would suggest that this list is decomposed into several sections: information about the material (independent of its location, namely the name and phenotypic data) and information about how it was gathered and prepared for removal to the lab, or vice versa. [I agree.
The document is a work in progress and once we have all agreed on exactly what needs reporting we can focus on how best to display the requirements.]

5.0. Additional Information
[Please be aware that there are additional requirements necessary for the processing of samples (e.g. extraction methods and analytical technologies used). These are detailed on the MSI webpage at http://msi-workgroups.sourceforge.net/.] [defined where?] [This is not defined yet; I suggest we just say something like ‘as detailed on the MSI webpage at http://msi-workgroups.sourceforge.net/’ as above]
Perhaps replace this list with a more general paragraph emphasising the need for data to also comply with reporting standards from relevant other groups, citing examples. [I agree and have changed it]

6.0. Bibliography
1. The International Council for the Exploration of the Sea (ICES) Integrated Environmental Reporting Format document (available at http://www.ices.dk/env/repfor/ERF322.doc).
2. Data reporting requirements from the PETRAEA project (http://www.petraea.shef.ac.uk/).
3. Plant Information Management System (PIMS) Database Schema (available at http://www.bioinf.shef.ac.uk/pims/images/FieldDB_new.png).
4. Sykes, J.M., Lane, A.M.J. and George, D.G. (Editors), 1999. Protocols for Standard Measurements at Freshwater Sites. The United Kingdom Environmental Change Network. Institute for Terrestrial Ecology, Grange-over-Sands, UK, 134 pp.
5. Cech, T.R., Sharing Publication-Related Data and Materials: Responsibilities of Authorship in the Life Sciences. Available at www.nap.edu/books/0309088593/html
6. A. Brazma, P Hingamp, J Quackenbush, G Sherlock, P Spellman, C Stoeckert, J Aach, W Ansorge, C A Ball, H C Causton, T Gaasterland, P Glenisson, F C P Holstege, I F Kim, V Markowitz, J C Matese, H Parkinson, A Robinson, U Sarkans, S Schulze-Kremer, J Stewart, R Taylor, J Vilo & M Vingron. Minimum information about a microarray experiment (MIAME)—toward standards for microarray data, Nature Genetics, vol 29 (December 2001), pp 365-371.