Preparing a data table: Not all that complicated!
By: The CEF professionals, 2007
Photos: web; André Gagné

Why this presentation?
• To avoid the enormous waste of time often associated with repeated but avoidable data handling.
• To make the data set easy to understand, even many years after it was first created!

Data sheets (1/2)
• Write legibly.
• Anticipate who might make use of the data.
• Include the date.
• Organise by line…
• Anticipate any contentious cases and try to work around them (e.g. % closure of the forest cover, or % opening?).

Data sheets (2/2)
Example field sheet: central plot, 400 m²
Location:    Site number:
Date:    Crew:    Coordinates:    Stand:    Azimuth:    Altitude:
% closure at centre:    Aspect:    Slope:
Ecoforest inventory:    Density:    Slope:    Height:    Aspect:    Disturbance:    Drainage:    Topographic position:
Humus type:    Humus depth:    Soil texture:    Deposit type:    Deposit depth:
Remark:
Standing trees (live or dead) (DBH > 9 cm), with this set of columns repeated three times across the sheet:
No    DBH (2 cm)    State (9 classes)    Species

The Data Table

Names of variables (1/5)
• Keep them as simple and short as possible, but still comprehensible!
• One variable = one column
• Several measurement times for the same variable = several columns
• Break up complex categories, e.g. irrigated and fertilised balsam fir in block 2:

sujet  temps1  temps2  temps3  bloc  espece  fertilisation  irrigation  traitement
a      10.2    14.3    18.7    2     sab     oui            oui         2-sab-fer-irr

(Column names are kept as in the original French data set: sujet = subject, temps = time, bloc = block, espece = species, traitement = treatment, oui = yes.)

Names of variables (2/5)
• Place class or categorical variables at the beginning of the table, followed by the response variables.
• Whether you use capital or small letters, be consistent, as certain programs (e.g. R) are case-sensitive!
• Things to avoid:
  – Special characters (/$%?&)
  – Spaces
• Be clear and precise, and include a comment sheet where all abbreviations are explained
  OR, if you are in Excel, use «Insertion / Comments».

Names of variables (3/5)
[Figure]

Names of variables (4/5)
• The «remarks» column in a data sheet gives a data table flexibility by allowing:
  – inclusion in an analysis, or
  – exclusion from an analysis
CHECK YOUR DATA

Names of variables (5/5)
The Golden Rule: One Line = One Sample
• Avoid the pain of having to start the process all over, or of having to make more or less complex data manipulations.

Date  Point  Parcelle  Echelle  haut  midh  midb  bas
502   1      A         10       50    40    20    70
502   1      B         10       0     0     10    10
502   1      C         10       0     0     10    10
502   1      D         10       40    50    40    50
502   1      I         25       0     70    100   100
502   1      J         25       0     0     25    70
502   1      K         25       95    90    90    60
502   1      L         50       50    60    70    50
502   1      M         50       60    50    80    70
502   1      N         50       95    85    95    100
515   2      A         10       20    50    40    100
515   2      B         10       0     5     15    75

Original data (1/6)
• Before making any modifications, save a copy of the original table, then make your modifications in a copy that you regularly save or back up.
Example:
terrain2005ori.xls
terrain2005_05122006.xls
terrain2005_06122006.xls
etc.

Original data (2/6)
• In your Excel workbook, the sheet farthest to the left should contain ONLY the original data, in a CONTINUOUS table.

Date  Point  Parcelle  Echelle  haut  midh  midb  bas
502   1      A         10       50    40    20    70
502   1      B         10       0     0     10    10
502   1      C         10       0     0     10    10
502   1      D         10       40    50    40    50
502   1      I         25       0     70    100   100

Date  Point  Parcelle  Echelle  haut  midh  midb  bas
502   1      A         10       50    40    20    70
502   1      B         10       0     0     10    10
502   1      C         10       0     0     10    10
502   1      D         10       40    50    40    50
502   1      I         25       0     70    100   100
502   1      J         25       0     0     25    70

Original data (3/6)
• NEVER leave cells empty

With empty cells (avoid):
bloc  couleur  valeur
1     rouge    12
      vert     18
      bleu     44
2     rouge    15
      vert     16
      bleu     14

Every cell filled in:
bloc  couleur  valeur
1     rouge    12
1     vert     18
1     bleu     44
2     rouge    15
2     vert     16
2     bleu     14

Original data (4/6)
• Missing data should be indicated by « . »
• Careful: always differentiate missing values from actual zero (0) values!
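The distinction between « . » (missing) and 0 (a real measurement) can be sketched in code. The following is only a minimal illustration in Python, assuming the table has been exported from Excel as comma-separated text; the names `RAW`, `parse_value` and `read_table` are hypothetical, and the values are those of the bloc/couleur/valeur example in this presentation.

```python
import csv
import io

# Assumed CSV export of the example table; "." marks the missing value.
RAW = """bloc,couleur,valeur
1,rouge,12
1,vert,.
1,bleu,44
2,rouge,0
2,vert,16
2,bleu,14
"""

MISSING = "."  # code used in the table for missing data


def parse_value(text):
    """Return None for a missing value, otherwise the number as a float."""
    text = text.strip()
    if text == MISSING:
        return None
    if text == "":
        # An empty cell is ambiguous: refuse it rather than guess.
        raise ValueError("empty cell: use '.' for missing data")
    return float(text)


def read_table(raw):
    """Read the CSV text into a list of dicts, one per line of the table."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw)):
        rows.append({
            "bloc": int(record["bloc"]),
            "couleur": record["couleur"],
            "valeur": parse_value(record["valeur"]),
        })
    return rows


rows = read_table(RAW)

# Missing values are left out of the mean; true zeros are kept in it.
observed = [r["valeur"] for r in rows if r["valeur"] is not None]
mean_valeur = sum(observed) / len(observed)
n_missing = sum(r["valeur"] is None for r in rows)
n_zero = sum(r["valeur"] == 0 for r in rows)
```

Keeping the missing value as `None` rather than converting it to 0 means it cannot silently enter a sum or a mean, while the recorded 0 still counts as an observation.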
Example table (« . » = missing value, 0 = true zero):
bloc  couleur  valeur
1     rouge    12
1     vert     .
1     bleu     44
2     rouge    0
2     vert     16
2     bleu     14

Original data (5/6)
• Be consistent with your codes: ABBA ≠ abba ≠ Abba
• Whether decimals are coded with « . » or « , » depends on your computer or operating system. If you aren't sure, use the decimal key on the numeric keypad.

bloc  couleur  valeur
1     Rouge    12
1     vert     .
1     bleu     44
2     rouge    0
2     vert     16
2     bleu     14

Original data (6/6)
• Include cases where species have not been observed (important in ecology!)

Unobserved species left out:
Espèce  Site  Abondance
A       1     2
B       1     5
A       2     4
C       2     3
B       3     5
C       4     2

Unobserved species included as 0:
Espèce  Site  Abondance
A       1     2
B       1     5
C       1     0
A       2     4
B       2     0
C       2     3
A       3     0
B       3     5
C       3     0
A       4     0
B       4     0
C       4     2

Analysis of tables
• Move from left to right, ALWAYS keeping a sheet that contains the raw data.
• One sheet for the original data, the others for pivot tables (cross tabulation) or charts.

Data tables vs. Data bases
• A data table is organised by lines (rows) and by columns.
• A data base is a collection of data sheets and tables that allows you to store a great amount of information.

Conclusion
• Remain logical and structured.
• Include raw data.
• Save and back up, frequently.
• «Comments» sheet (measurement units, abbreviations, personal notes…)
• REMEMBER: The word «DATA» is plural! The word «DATUM» is singular!

Questions?
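As a complement to the «Original data (6/6)» slide, here is a minimal Python sketch of how the unobserved species could be added back as explicit zeros. It assumes that every species was actually looked for at every site, so that an absent record is a true zero and not a missing value; the function name `add_unobserved` is hypothetical, and the records are those of the Espèce/Site/Abondance example.

```python
from itertools import product

# Records as noted in the field: (espece, site, abondance).
# Species that were not seen at a site are simply absent here.
observed = [
    ("A", 1, 2), ("B", 1, 5),
    ("A", 2, 4), ("C", 2, 3),
    ("B", 3, 5),
    ("C", 4, 2),
]


def add_unobserved(records):
    """Return one line per species x site, with 0 where nothing was recorded.

    Assumption: every site was surveyed for every species, so a missing
    combination means "not observed" (0), not "not measured" (".").
    """
    counts = {(espece, site): n for espece, site, n in records}
    species = sorted({espece for espece, _, _ in records})
    sites = sorted({site for _, site, _ in records})
    return [
        (espece, site, counts.get((espece, site), 0))
        for site, espece in product(sites, species)
    ]


complete = add_unobserved(observed)
for espece, site, abondance in complete:
    print(espece, site, abondance)
```

Note that a site where no species at all was recorded would not appear in `observed`, so it cannot be recovered this way; the list of sites would then have to be supplied separately.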