IBDP Internal Assessment: INDIVIDUAL INVESTIGATION - BIOLOGY HL/SL – Ms. Blanka Vrgoc

TABLE OF CONTENTS (with sources):
• IA lab design & write-up guidelines (Blanka Vrgoc)
• Marking criteria unpacked (source unknown)
• Animal experimentation policy (IB)
• Data presentation in biology (IB)
• Choosing a statistical test (intro2r)
• Guidance on the use of criteria (IB/unknown)
• Sample paper (IB)
• Sample paper - marks with comments (IB)
• Sample paper - annotated (IB)
• IA criteria
• Grading rubric DRAFT self-assessment (IB/Blanka Vrgoc)
• Grading rubric DRAFT peer review (IB/Blanka Vrgoc)
• Grading rubric DRAFT post-peer review self-assessment (IB/Blanka Vrgoc)
• Statistical booklet: 1 (source unknown), 2 (Karl Schauer), 3 (source unknown), 4 (Bio Factsheet)

IBDP G4 Internal Assessment: INDIVIDUAL INVESTIGATION DESIGN & WRITE-UP GUIDELINES
Ms. Blanka Vrgoc (Adapted from MYP & DP/IA Science Criteria)

IN GENERAL:
• No student/teacher/school name, nothing identifiable
• 6-12 pages in total
  − Bibliography included
  − Excessive amounts of raw data can be attached as an appendix beyond the 12-page limit, but the moderators are not required to read it (!), so make sure the 6-12 pages contain everything necessary for understanding (up to and including the bibliography)
  − Page format – not prescribed, but has to be easy to read (e.g. A4, margins 1-2.5 cm, single/double spacing)
  − Page numbers are always good to have
  − Table of contents is not required
  − Word count is not important
• Font
  − Style not prescribed, but has to be a standard easy-to-read one (e.g. Times New Roman, Calibri)
  − Size not too big and not too small (e.g. font 11±1)
• Use appropriate subject-specific scientific terminology, watch out for spelling and grammar(!)
• You can use your own photos/drawings (e.g. for specific experimental set-up, to illustrate qualitative results, etc.)

The following list should be used as a guideline of what an IA should contain:

1. TITLE
What is your IA about?
• Research question or a proper title (RQ rephrased into a statement)
• Does not need to take up an entire page!

2. INTRODUCTION
What do you find interesting that you would like to know more about, and how will you test that?
Explain a problem or question to be tested by a scientific investigation.
• Explain a clear and focused reason why you chose to explore what you chose (for your IA) – how you got to the idea and how you developed/adapted a procedure, including observations, citations, or other studies that have led you to this, why it’s relevant to you and how it’s applicable elsewhere
• Provide appropriate and relevant scientific background information on the topic
• RESEARCH QUESTION
  • Has to be introduced/restated (depending on the title) as a clear and focused question!
  • Should include a brief mention of your independent and dependent variables and, if applicable, the name of the organism studied (both scientific and common)
  • Has to be focused, researchable, answerable, arguable, non-biased (avoid yes/no questions)
• HYPOTHESIS
  Formulate and explain a testable hypothesis using correct scientific reasoning.
  − Not obligatory, but advisable – should be phrased as a predicted answer to your research question, based on your current knowledge (this is not what you expect to happen, but rather what you think will happen according to what you already know)
  − The RQ and the hypothesis provide a good anchor to refer to after you’ve conducted your experiment: you seek to test (not prove!) your hypothesis to answer your RQ, so it’s good to explicitly explain your results in the light of the two
  − This is an example of a lab-based hypothesis, or other experiments where you are setting up the variables:
    If/When ______________ (INDEPENDENT VARIABLE), then ______________ (DEPENDENT VARIABLE), because ______________ (use your knowledge to explain your prediction).
  − The variables will vary for ecology or other observational experiments where you are researching what is already happening
• VARIABLES
  List variables and explain how to manipulate them. Explain how sufficient relevant data will be collected.
  − A table or a bulleted list is preferable
  − Bear in mind that variables as such are not necessarily applicable if you are researching correlation (e.g. in ecology or genetics)
  − INDEPENDENT VARIABLE (min. 5 increments) – what you set up to test the effects of
  − DEPENDENT VARIABLE (min. 5 repetitions/replicates, but the more the better) – specify what data you will collect and how; what you will use to measure the consequences of the independent variable (what will change that you can measure), including the instrument error (half of the smallest unit it can measure); has to be quantitative = measurable (qualitative is immeasurable, but good as a visible/tangible observation)
  − CONTROLLED VARIABLES – everything that (you make sure) will be the same in all experiments (specify how you will control them)
  − CONFOUNDING VARIABLES – variables beyond your direct control, but that would have the same influence on all experimental conditions (specify the influence if possible)
• METHODOLOGY
  Design a logical, complete and safe method using appropriate materials and equipment.
  • MATERIALS – list all the major pieces of apparatus, equipment and substances used
  • SET UP – labeled drawing/diagram/photo that shows the apparatus you used (not necessary, but useful for certain specific experiment-based investigations)
  • SAFETY, ETHICAL, ENVIRONMENTAL CONSIDERATIONS – outline any safety concerns & how they will be addressed (Animal Experimentation Policy!), where your materials are coming from, what will happen to them afterwards, how wastes will be disposed of, etc. (if something is not applicable, state so explicitly). If you are doing any experiment involving humans, you need to obtain informed consent forms from them, and this needs to be explicitly addressed (an example of such a form is something you could add as an appendix beyond the 12th page)
  • PROCEDURE – precise list of steps used (passive voice) – the reader should be able to recreate your exact experiment and get the same result (a paragraph is acceptable, but writing the “cookbook” as a numbered list makes it much clearer)

3. DATA COLLECTION & ANALYSIS
TRANSFORMING & PRESENTING DATA – data processing and/or statistical analyses, and any visuals (graphs, etc.) that make the data easier to understand and make sense of
Correctly collect, organize, transform and present data in numerical and/or visual forms.
• RAW DATA COLLECTION
  − Any qualitative observations made during the experiment should be mentioned
  − Quantitative data includes raw data collected by measuring the dependent variable (and anything relevant about the controlled variables, if applicable) – usually best displayed in a table
  − Raw data displayed in a graphic form is still just raw data!
    (so only use it if it facilitates understanding; avoid repetition for the sake of adding a graph)
• TRANSFORMING & PROCESSING DATA
  − Overview of processing should be very short and simply indicate what you did to process the data to facilitate interpretation – state what statistical test was performed and why
  − No need to include sample calculations (but not forbidden if it’s really important)
  − STATISTICS (specific tests may or may not be applicable, but some of these or other appropriate ones have to be present):
    Descriptive: mean, median, mode, % change/difference
    Treatment of error: range, min/max value, standard deviation, significance of error
    Statistical tests: t-test/ANOVA/correlation coefficient/χ²/etc. (depends on data collected)
    If you are using a statistical tool which contains its own “internal” hypothesis, make sure not to confuse that one with your RQ hypothesis – these are two completely different things!
• PRESENTING (PROCESSED) DATA
  − Tables:
    Numbered in sequence and with a precisely labeled title
    Well designed and clear – all rows & columns must have headers, units must be given (uncertainties can be given in the title row, underneath the table or as footnotes, as applicable), decimal places must be consistent
    Try to avoid splitting a table between pages; if that is not possible, make sure the title row carries over and it’s clear that it’s the same table continued
  − Graphs:
    Carefully chosen type to best and most clearly display the trends in data
    Numbered in sequence and with a precisely labeled title
    Axes are labeled and units given (uncertainties can be given underneath the graph or as footnotes)
  − All tables and graphs should be described/explained in the text body (preferably introduced before they are presented), not just put in as stand-alones

4. RESULTS
Use your knowledge to interpret the data from your experiment – what do the results mean, what do they tell you? Put the numbers into words.
Accurately interpret data and explain results using correct scientific reasoning.
• Summarize (briefly describe) what the data (already presented in tables and/or graphs) you observed during the experiment mean – “put the numbers into words”

5. CONCLUSION
Compare the RQ and/or the original hypothesis to the results obtained. Does the data help you answer your RQ? Is the hypothesis supported by the evidence or not? How do the results provide evidence for your conclusion? Give your results context – state whether they answer your research question or not & use the results to explain why (not). Are your results in line with other people’s research?
Evaluate the validity of a hypothesis based on the outcome of a scientific investigation.
• Explain what your data really means in a broader context
• If you wrote a hypothesis, restate it and discuss whether your data supports or rejects it, justify your conclusion through the data obtained
• Restate your RQ and discuss whether your data does/doesn’t answer it, justify your conclusion through the data obtained
• Put your results into accepted scientific context – use other published scientific papers on the same topic and compare relevant results to yours

6. EVALUATION
What worked well during the experiment? Were there any mistakes in the lab set up or while performing the lab? Was there anything else that did not go as planned? Were the results clear enough, or did these issues affect them? How? Is there something that could be done better or to get better results? Could there be another experimental approach for testing the same hypothesis?
Evaluate the validity of the method based on the outcome of a scientific investigation.
Explain improvements or extensions to the method that would benefit the scientific investigation.
• State the strengths and limitations/weaknesses of your lab design, discuss why
• Mention any potential preliminary trials conducted and/or modifications to your experiment
• Offer realistic suggestions that would improve the limitations/weaknesses you identified
• Provide ideas on what else can be done to further expand the understanding of the matter you researched

7. BIBLIOGRAPHY
List all textbooks, scientific papers, etc. you used (quoted) during any of the steps (APA citation format suggested, but any is ok as long as it is consistent).
List all the sources you used in your research.
• Best if listed at the very end
• Be consistent in the formatting style (APA recommended for science, MLA is also acceptable, as is any other you choose – so long as it is the same throughout the write-up)

MARKING CRITERIA UNPACKED
Levels of performance are described using multiple indicators per level. In many cases the indicators occur together in a specific level, but not always. Also, not all indicators are always present. This means that a candidate can demonstrate performances that fit into different levels. To accommodate this, the IB assessment models use markbands and advise examiners and teachers to use a best-fit approach in deciding the appropriate mark for a particular criterion. The indicators per level per criterion can be found in Table 1 on the last page.
Additional guidance (what is marked) per criterion:

Personal Engagement (2): Individuality, originality, creativity in experiment design, personal interest, independent thinking & research
• A statement of purpose
• The relationship with the real world
• The originality of the design of the method (choice of materials and methods)
• Evidence of trial runs
• The difficulty of collecting data (evidence of tenacity)
• The quality of the observations made
• The care in the selection of techniques to process the data
• The reflections on the quality of the data
• The type of material referred to in the background or in the discussion of the results
• The depth of understanding of the limitations in the investigation
• The reflections on the improvement and extension of the investigation

Exploration (6): Workable method, focus on the problem, sufficient data, health/safety/ethical/environmental considerations
• The protocol for collecting the data
• The range and intervals of the independent variable (where applicable)
• The selection of measuring instruments (where relevant)
• Techniques to ensure adequate control (fair testing)
• The use of control experiments
• The quantity of data collected, given the nature of the system investigated
• The type of data collected
• Provision for qualitative observations
Safety/Ethical/Environmental Issues:
• Evidence of a risk assessment, even if the investigation is considered “safe”
• An appreciation of the safe handling of chemicals or equipment (e.g. the use of protective clothing and eye protection for labs, appropriate gear for a given sport)
• An appreciation of the particular safety issues specific to some sports
• Consideration of basic hygiene
• The application of the IB animal experimentation policy
• A reasonable use of materials
• The use of consent & PAR-Q forms, and a consideration of the welfare of the volunteers
• The correct disposal of waste
• Attempts to minimize the impact of the investigation on the environment

Analysis (6): Recording & processing, treatment of data, analysis of processed data
• Carefully selected appropriate statistical tool(s) to identify trends in the data
• Calculations carried out with precision

Evaluation (6): Conclusion, identification of strengths & weaknesses, improvements & extensions, variability & significance of data
• A conclusion that is supported by the data
• A conclusion that refers back to the research question (and hypothesis, where applicable)
• An explanation based upon a scientific context
• A discussion of the strengths – this might be quite general or implicit or it might refer to specific parts that worked well or data that was consistent
• Discussion of the reliability of the data
• Identified weaknesses in the method and materials
• The evaluation of the relative impact of a weakness on the conclusion
• Sensible, realistic improvements (with details)
• Realistic extensions that clearly follow on from the investigation

Communication (4): Subject-specific vocabulary, correct format, graphs & tables quality and labeling, consistency, units & recording of errors, logical and easy to read, consistent referencing
• The use of whole pages for titles is not necessary
• Table of contents is not necessary
• No need for blank data tables presented at the end of the method section
• There is often no need for a raw data table as well as a table with processed data (especially if the raw data is excessive)
• Do not relegate raw data to the appendix without good reason; this upsets the flow of the report
• Clear and purposeful data table & graph headers
• Avoid splitting a table over two pages, or having a title on one page and the table (or graph) on the next page
• Avoid drawing multiple graphs when they could have been combined
• Make sure the sizes of graphs are appropriate and easily readable
• A missing bibliography or missing footnotes, endnotes or in-text citations will lead to suspected plagiarism
• Avoid references with an incomplete format (a URL alone is not enough)
• Adhere to proper scientific vocabulary

Guidelines for the use of animals in IB World Schools

Why have guidelines for use of animals in the classroom?
As respect for animals is a fundamental stepping stone in the development of respect for fellow human beings, the IB animal guidelines seek to set out the parameters for the acceptable inclusion of animals in an IB World School.

What do the guidelines apply to?
These guidelines apply to the treatment of all animals in IB World Schools, to all students at all levels including PYP, MYP, DP and IBCC, whether assessed or non-assessed, for extended essays, the group 4 project and the MYP project. The Guidelines cover any work, be it in classrooms or school laboratories, or in the general environment; that is, anywhere IB students may be working.
The Guidelines apply to:
1. Keeping animals in schools
2. Animal Experimentation
3. The use of human subjects in investigations

The Guidelines

Keeping live animals in the classroom
Caring for classroom pets can provide a variety of authentic learning contexts for students at almost every level.
It presents opportunities for students to develop compassion and empathy towards other living things and take action as a result of this learning. Ultimately the decision to care for a live animal lies with the classroom teacher, and time should be taken to adequately research the animal and determine a suitable diet, housing, exercise and socialization for the animal as well as how its care fits into the curriculum.
The following should be carefully considered before committing to the care of a classroom pet:
• Student sensitivity or allergies to particular species, their food or bedding materials
• Type of animal (domestic rather than wild, not venomous or vicious, diurnal rather than nocturnal etc.)
• Arrangements for housing the animal safely, comfortably, cleanly and in a manner that is not disruptive to the classroom environment
• Arrangements for appropriate care of the animals over weekends and holidays
• Long term care of the animal in cases where a future student is allergic or the animal can no longer live in the classroom
Additionally, essential agreements should be established regarding when and how the animal is to interact with students. These should ensure the health and safety of both students and the animal (e.g. students wash their hands before and after handling).

The nature of the guidelines
IB animal experimentation guidelines may be more stringent than some local or national standards for experimentation in schools. Our standards for work in schools should also be more stringent than those of university and research and development committees as we are not carrying out essential, groundbreaking research. Practical work in schools has other purposes such as reinforcing concepts and teaching practical skills and techniques. Even in a practically based extended essay the work will not be fundamental, ground-breaking research.
Live animals in experimentation
Any planned and actual experimentation involving live animals must be subject to approval by the teacher following a discussion between teacher and student(s) based on the IB guidelines. This discussion should look at the 3Rs principle, and the decision should be justified. The principles are:
• Replacement
• Refinement
• Reduction
Any investigation involving animals should initially consider the replacement of animals with cells or tissues, plants or computer simulations. If the animal is essential to the investigation, refinements to the investigation to alleviate any distress to the animal and a reduction in the numbers of animals involved should be made.
Experiments involving animals must be based on observing and measuring aspects of natural animal behaviour. Any experimentation should not result in any cruelty to any animal, vertebrate or invertebrate. Therefore experiments that administer drugs or medicines or manipulate the environment or diet beyond that which can be regarded as humane are unacceptable in IB schools.

© International Baccalaureate Organization 2015 International Baccalaureate® | Baccalauréat International® | Bachillerato Internacional®

Animal dissection
There is no requirement in the PYP, MYP or in the DP group 4 sciences for students to witness or carry out a dissection of any animal, vertebrate or invertebrate. If teachers believe that it is an important educational experience and wish to include dissections in their scheme of work they must apply the following guidelines.
• The IB does not support animal dissection or the use of animal body parts in the PYP.
• Discuss reasons for dissections of whole animals with the students.
• Allow any student who wishes to opt out of the dissection to do so.
• Seek to reduce the number of dissections.
• Seek to replace animal dissection with computer simulations and/or use animal tissue, for example, hearts and lungs obtained from butchers, abattoirs or laboratory suppliers.
• Dissect animals obtained from an ethical source only, for example, no wild animals, animals killed on the road or endangered animals.

Experiments involving human subjects
Any experimentation involving human subjects must be with their direct, legally obtained written permission and must follow the above guidelines. In addition, the investigation must not use human subjects under the age of 16 without the written consent of the parents or guardians.
• Subjects must provide written consent.
• The results of the investigation must be anonymous.
• Subjects must participate of their own free will.
• Subjects have the right to withdraw from the investigation at any time.
Investigations involving any body fluids must not be performed due to the risk of the transmission of blood-borne pathogens. An exception would be an investigator using their own saliva or sweat.

The use of secondary data
Secondary data acquired as a result of research that would not be in line with the above policy may be used under certain circumstances:
• Data acquired by professional researchers. In this case the data would be from research which is written up in academic journals and qualifies as ground breaking. Such research would have been presented to research committees for approval and be licensed.
• Research which was considered ethical at the time the research was conducted. Our view of animals and their welfare has moved on considerably in recent years. Much research conducted in a different culture would not be granted permission today even though, at the time, it was considered acceptable. Data from such sources is acceptable.
Some secondary data exists that was considered unethical even within the cultural and historical context of the day. Such data is not acceptable under any circumstances.

What happens if the guidelines are not followed?
Internal assessment moderators or extended essay examiners who see evidence that the guidelines are not being followed at the school, in the sample work sent for moderation or in extended essays are required to complete a problem report form (PRF) to be submitted to IB Cardiff.

Data presentation in biology
These guidelines are for HL and SL students for writing up their investigations whether they are assessed or not. The outlines are not prescriptive, but are there to help students produce clear and easy to interpret presentations of their work.

Units
The international system of units should be used wherever possible, although the main consideration is that units should be fit for purpose. It is, for example, preferable to use minutes rather than seconds in some instances such as when assessing the effect of exercise on heart rate or the rate of transpiration, or cm³ rather than m³ for depicting the volume of carbon dioxide produced by respiring yeast cells. Non-metric units such as inches or cups should not be used.

Tables
Tables are designed to lay out the data ready for analysis. The table should have an explanatory title. “Table of results” is not an explanatory title, whereas “Table to show the time taken to produce 1 cm³ of oxygen at different concentrations of carbon dioxide by Elodea” describes the nature of the data collected. Other points to note are:
• units should only appear in cell headings rather than in the body of the table
• error for the instrument used or the accuracy of the reading should appear in the cell heading if relevant
• the independent variable should be in the first column
• subsequent columns should show the results for the dependent variable
• decimal places should be consistent throughout a column
• mean values should not have more decimal places than the raw data used to produce them.
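The last rule in the list above (a mean carries no more decimal places than the raw data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_to_raw_precision` and the sample times are hypothetical, readings are passed as strings so that trailing zeros are preserved, and rounding half up is one reasonable choice among several.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean_to_raw_precision(readings):
    """Return the mean of the readings, rounded to the fewest decimal
    places found among the raw readings (given as strings)."""
    values = [Decimal(r) for r in readings]
    # Decimal places of each raw reading, e.g. "12.5" has 1.
    places = min(max(-v.as_tuple().exponent, 0) for v in values)
    mean = sum(values) / len(values)
    # Quantize the mean to the same number of decimal places.
    step = Decimal(1).scaleb(-places)
    return mean.quantize(step, rounding=ROUND_HALF_UP)


# Hypothetical raw data: time (s) taken to produce 1 cm³ of oxygen.
raw_times = ["12.5", "13.1", "12.8"]
print(mean_to_raw_precision(raw_times))
```

Keeping the readings as strings rather than floats is a deliberate choice here: a float cannot tell 1.20 from 1.2, but the number of decimal places recorded is exactly the information the rule depends on.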
The methods used to process the data should be easy to follow, and the processed data may be included in the same table as the raw data; there is no need to separate them.

Graphs
Graphs should be clear, easy to read and interpret with an explanatory title. If IT software is used, the graph should have clearly identifiable data points and demarcated and labelled axes of a suitable scale. Adjacent data points should be joined by a straight line and the line should start with the first data point and end with the last one, as there should be no extrapolation beyond these points. Lines of best fit are only useful if there is good reason to believe that intermediate points fall on the line between two data points. The usual reason for this is the collection of a large amount of data, which is often not possible given the time constraints of investigations at this level. Likewise, extrapolation of the line will only make sense if there is a large amount of data and a line of best fit is predicted or there is reference made to the literature values. Students should exercise caution when making assumptions. Finally, the type of graph chosen should be appropriate to the nature of the data collected.

Error
There are sources of error at a number of stages of any investigation. The chosen method should try to address as many as possible by considering the control of variables, but despite this, many will remain. Students should not be discouraged by this, since experimental results are only samples (see NOS section 3, “The objectivity of science” in the Biology guide), but should rather take these sources of error into consideration when analysing the data and drawing conclusions. A thorough evaluation of the sources of uncertainty and error will also help to gain perspective on the investigation in general and to suggest potential improvements and extensions.
Random variation and normal variation
In biological investigations, errors can be caused by changes in the material used or by changes in the conditions under which the experiment is carried out. Biological materials are particularly variable. For example, the water potential of potato tissue may be calculated by soaking pieces of tissue in a range of concentrations of sucrose solutions. However, the pieces of tissue will vary in their water potential, especially if they have been taken from different potatoes. Pieces of tissue taken from the same potato will also show variations in water potential, but they will probably show a normal variation that is less than that from samples taken from different potatoes. Random errors can, therefore, be kept to a minimum by careful selection of material and by careful control of variables, for example by using a water bath to reduce the random fluctuations in ambient temperature.

Human errors
Making mistakes is not an acceptable source of error if they could have been easily avoided with more care and attention. Data loggers can be used if a large number of measurements need to be made, to avoid errors arising due to loss of concentration. Careful planning can help reduce this risk.

The act of measuring
When a measurement is taken, this can affect the environment of the experiment. For example, when a cold thermometer is put into a test tube with only a small volume of warm water in it, the water will be cooled by the presence of the thermometer, so it would be sensible to scale up the volume or have the thermometer in the solution from the start. If the behaviour of animals is being recorded, the presence of the experimenter may influence the animals’ behaviour. Although there are ways to reduce the impact of observer influences, it may have to be something that is taken into account later.
Systematic errors
Systematic errors can be reduced if equipment is regularly checked or calibrated to ensure that it is functioning correctly. For example, a thermometer should be placed in an electronic water bath to check that the thermostat of the water bath is correctly adjusted. A blank should be used to calibrate a colorimeter to compensate for the drift of the instrument.

Degrees of precision and uncertainty in data
Students must choose an appropriate instrument for measuring such things as length, volume, pH and light intensity. This does not mean that every piece of equipment needs to be justified, and it can be appreciated that, in a normal science laboratory, the most appropriate instrument may not be available.
For the degrees of precision, the simplest rule is that the degree of precision is plus or minus (±) the smallest division on the instrument (the least count). This is true for rulers and instruments with digital displays.
The instrument limit of error is usually no greater than the least count and is often a fraction of the least count value. For example, a burette or a mercury thermometer is often read to half of the least count division. This would mean that a burette value of 34.1 cm³ becomes 34.10 cm³ (±0.05 cm³). Note that the volume value is now cited to one extra decimal place so as to be consistent with the uncertainty.
The estimated uncertainty takes into account the concepts of least count and instrument limit of error, but also, where relevant, higher levels of uncertainty as indicated by an instrument manufacturer (this information is usually obtainable online), or qualitative considerations such as parallax problems in reading a thermometer scale, reaction time in starting and stopping a timer, or random fluctuation in an electronic balance read-out. Students should do their best to quantify these observations into the estimated uncertainty.
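As a rough illustration of the burette example above, the sketch below writes a reading to the same number of decimal places as its uncertainty. The helper name `format_reading` and its `fraction` parameter are hypothetical; the default of half the least count follows the burette case, while `fraction="1"` corresponds to the simpler rule of plus or minus one least count.

```python
from decimal import Decimal


def format_reading(value, least_count, unit, fraction="0.5"):
    """Format a reading as 'value unit (±uncertainty unit)'.

    The uncertainty is fraction * least_count (half the least count by
    default), and the value is written to the same number of decimal
    places as that uncertainty.
    """
    uncertainty = (Decimal(least_count) * Decimal(fraction)).normalize()
    # Step with the same exponent as the uncertainty, e.g. 0.05 -> 0.01.
    step = Decimal(1).scaleb(min(uncertainty.as_tuple().exponent, 0))
    reading = Decimal(value).quantize(step)
    return f"{reading} {unit} (±{uncertainty.quantize(step)} {unit})"


# Burette with 0.1 cm³ divisions, read to half a division.
print(format_reading("34.1", "0.1", "cm³"))  # 34.10 cm³ (±0.05 cm³)
```

This covers only the instrument-based part of the estimated uncertainty; effects such as parallax or reaction time still have to be judged and added by the student.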
Other protocols exist and no specific protocol is preferred as long as it is clear that recording of uncertainties has been undertaken and the uncertainties are of a sensible and consistent magnitude.

Propagating errors
Propagating errors during data processing is not expected but it is accepted provided the basis of the experimental error is explained.

Replicates and samples
Biological systems, because of their complexity and normal variability, require replicate observations and multiple samples of material. As a rule of thumb, the lower limit is five measurements within the independent variable, with three runs for each. This will produce five data points for analysis. So in an investigation into the effect of temperature on the rate of reaction of an enzyme, temperature is the independent variable (IV) and the rate of reaction the dependent variable (DV). The rate would need to be measured three times at each of five different temperatures at the very least. Obviously, this will vary within the limits of the time available for an investigation. Some simple investigations permit a large number of measurements, or a large number of runs. It is also possible to use class data to generate sufficient replicates to permit adequate processing of the data in class, non-assessed practical work.
The standard deviation is the spread of the data around the mean. The larger the standard deviation, the wider the spread of data is. Standard deviation is used for normally distributed data. This makes it useful for showing the general variation/uncertainty around a point on a line graph, but it is less helpful for identifying potential anomalies. Error bars that plot the highest and the lowest value for a test, joined by a vertical line through the mean (which forms the data point plotted on the graph), will allow the variation/uncertainty for each data set to be assessed.
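The minimum design described above (five values of the independent variable, three runs each) and the two ways of showing spread can be sketched as follows. All numbers are made up for illustration, the names `rates_by_temperature` and `summarise` are hypothetical, and the sample standard deviation (n - 1 in the denominator) is assumed because three runs are a small sample.

```python
from statistics import mean, stdev

# Hypothetical data: enzyme reaction rate (arbitrary units) measured in
# three runs at each of five temperatures (°C).
rates_by_temperature = {
    20: [1.1, 1.3, 1.2],
    30: [2.0, 2.4, 2.2],
    40: [3.1, 3.0, 3.5],
    50: [2.6, 2.2, 2.4],
    60: [0.9, 0.5, 0.7],
}


def summarise(runs):
    """Return mean, sample standard deviation, minimum and maximum."""
    return {
        "mean": mean(runs),
        "sd": stdev(runs),  # sample SD (n - 1 in the denominator)
        "min": min(runs),
        "max": max(runs),
    }


summary = {temp: summarise(runs) for temp, runs in rates_by_temperature.items()}

for temp, s in summary.items():
    # The min-to-max range is what a highest/lowest error bar would span.
    print(f"{temp} °C: mean {s['mean']:.1f}, SD {s['sd']:.1f}, "
          f"range {s['min']:.1f} to {s['max']:.1f}")
```

Each dictionary entry yields one data point (the mean) plus either form of error bar, so the five temperatures give the five points the rule of thumb refers to.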
If the error bars are particularly large, then it may show that the readings taken are unreliable (although reference to the scale might be needed to determine what large actually is). If the error bars overlap with the error bar of a previous or subsequent point, then it would show that the spread of data is too wide to allow for effective discrimination. If trend lines are possible, then adding the coefficient of determination (R²) can be helpful as an indication of how well the trend line fits the data.
Statistics
An effective presentation of the data goes a long way to assessing whether or not a trend is emerging. This is, however, not the same as using statistics to assess the nature of such a trend and whether it is significant (in other words, whether a trend, judged subjectively from a graph, is actually valid). Students are encouraged to use a statistical test to assess their data, but should briefly explain their choice of test, outline the working hypothesis and put the results of the test into the context of their investigation. For statistical tests the correct protocol should be presented including null and alternative hypotheses, degrees of freedom, critical values and probability levels.
intro2r (http://www.intro2r.info/) Updated 2018-10-01 (mailto:simon.queenborough@yale.edu)
Choosing a Statistical Test
Statistical tests are just tools. Using the correct tool for a specific job is much easier, more fun, and more useful than using the wrong tool. Learning how to select the correct tool takes practice. Sometimes several different tools could be used and address slightly different questions or nuances of the same question. In some cases there is no single perfect tool and we must settle for an imperfect one, understanding its limitations. Statistics is an area of active research and development, with new tools being developed and tested.
Statisticians often disagree about these new tools and how useful they are. There are several ways to approach thinking about what test is most appropriate. The following questions should be useful in guiding what tests are better or worse for your question.
1. What is your (statistical) objective?
There are various ways of using statistics. The following list moves from the easiest to the hardest, practically, computationally, and philosophically.
• Description: Describes and characterizes populations and samples using descriptive statistics, graphs, and maps (e.g., opinion surveys, polls, population census).
• Classification: Classifying, identifying and categorizing a sample based on its characters uses descriptive statistics, graphics, and multivariate techniques (e.g., drug effectiveness).
• Comparison: Looks for differences between populations, samples, or reference values using ANOVA and similar tests (e.g., species identification and description, identifying criminals and terrorists).
• Prediction: We can predict future measurements using regression, time series analysis or spatial interpolation (e.g., weather, election results, student success).
• Explanation: Here we look for the most important drivers of variation in the data to try and understand what is going on (a lot of academic research … e.g., does climate change affect species phenology? does conservation intervention x actually work?).
2. How many variables do you have?
Do you have one variable, two, more than two, or a lot?
3. What kind of data are they?
Discrete
Discrete data can only take particular values, with no grey area in between.
• Categorical/nominal: counts or frequencies of things in two or more groups/categories, with no intrinsic ordering to these categories (e.g., eye colour, gender, species).
• Ordinal: counts or frequencies of things in two or more groups/categories, but with intrinsic ordering to these categories (e.g., level of education: elementary school, high school, college, post-graduate).
• Interval: counts or frequencies of things in two or more groups/categories, with intrinsic ordering to these categories, and where the spacing between categories is equal (e.g., equal-sized groups of age: 0-9, 10-19, 20-29, …; income, etc.).
• Integer: Numeric data can be discrete if we are counting whole things (e.g., the number of apples), even if there is potentially an infinite number of these things.
Continuous
Continuous data are not restricted to any particular set of values. Continuous data can take any value over a continuous range. These data are always essentially numeric (e.g., height, weight, length). Continuous data can be treated as discrete by binning, or putting each value into a specific category that encompasses a range of data. This data is then interval data (see above).
4. Is there a distinction between dependent and independent variables?
Is there some a priori reason why you think that one variable has a direct effect on the other? If you are trying to predict or explain something, you are assuming some causality.
5. Are the samples autocorrelated?
Observations may be correlated with each other in some way (i.e., autocorrelated).
• Time: if you take repeated measurements of the same thing, e.g., a person's height every year, the temperature every hour, the DBH of a tree every month.
• Space: if you measure the distribution of nutrients or minerals in soil, household income throughout a town.
In these two cases, temporal or spatial autocorrelation has little-to-no effect on the coefficient estimates in your statistical models, but it will affect the variance (standard deviation, standard errors) that are calculated, and therefore also any p-values and statistical significance.
• Sequence: if you use the same equipment to take measurements and that equipment slowly drifts out of calibration, or if you take light measurements at the forest floor as the sun comes up.
1. Based on the five questions
2. Based on the data
Source (https://statswithcats.wordpress.com/2010/08/27/the-right-tool-for-the-job/)
Source (http://www.efoza.com/postpic/2012/05/statistical-test-flow-chart_237102.jpg)
Guidance for the use of the internal assessment criteria
The internally assessed component of the course is divided into five sections. The sections are differently weighted to emphasize the relative contribution of each aspect to the overall quality of the investigation.
Pers. eng.   Exploration   Analysis   Evaluation   Communication   Total
2 (8%)       6 (25%)       6 (25%)    6 (25%)      4 (17%)         24 (100%)
Each section aims to assess a different aspect of the student’s research abilities. As the investigations, and therefore the approaches to the investigation, will be specific to each student, the marking criteria are not designed to be a tick-chart markscheme and each section is meant to be seen within the context of the whole. As such, a certain degree of interpretation is inevitable. The following tips are designed to help focus on the intention of each section, rather than be seen as a definitive approach.
Once you’ve completed your IA write-up, go through this handout and self-assess your work. Suggestion: underline the descriptor components you think you have successfully reached/completed, and circle the final mark with the best-fit approach.
Personal engagement
The emphasis within this section is on individuality and creativity within the investigation. The question to ask is, has the chosen research question been devised as a result of the personal experience of the student? The question could be a result of observations made in the student’s own environment or ideas that the student has had as the result of learning, reading or experimenting in class. The investigation does not have to be ground-breaking research, but there should be an indication that independent thought has been put into the choice of topic, the method of inquiry and the presentation of the findings.
The topic chosen should also be of suitable complexity. If the research question is very basic or the answer self-evident then there is little opportunity to gain full marks for exploration and analysis as the student will not have the opportunity to demonstrate his or her skills.
This criterion assesses the extent to which the student engages with the exploration and makes it their own. Personal engagement may be recognized in different attributes and skills. These could include addressing personal interests or showing evidence of independent thinking, creativity or initiative in the designing, implementation or presentation of the investigation.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1
• The evidence of personal engagement with the exploration is limited with little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
2
• The evidence of personal engagement with the exploration is clear with significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
Exploration
The issue here is the overall methodology. Students need to take their individual ideas and translate them into a workable method. Students must also demonstrate the thinking behind their ideas using their subject knowledge.
The information given must be targeted at the problem rather than being a general account of the topic matter, in order to demonstrate focus on the issues at hand. What needs to be seen is a precise line of investigation that can be assessed using scientific protocols. It is then expected that the student gives the necessary details of the method in terms of variables, controls and the nature of the data that is to be generated. This data must be of sufficient quantity and treatable in an appropriate manner, so that it can generate a conclusion, in order to fulfill the criteria of analysis and evaluation. If the method devised does not lead to sufficient and appropriate data, this will lead to the student being penalized in subsequent sections where this becomes the crux of the assessment. Health and safety is a key consideration in experimental work and forms part of a good method. If the student is working with animals or tissue, it is reasonable to expect there to be evidence that the guidelines for the use of animals in IB World Schools have been read and adhered to. The use of human subjects in experiments is also covered by this policy. If the student is working with chemicals, some explanation of safe handling and disposal would be expected. Full awareness is when all potential hazards have been identified, with a brief outline given as to how they will be addressed. It is only acceptable for there to be no evidence of a risk assessment if the investigation is evidently risk-free—such as in investigations where a database or simulation has been used to generate the data. This criterion assesses the extent to which the student establishes the scientific context for the work, states a clear and focused research question and uses concepts and techniques appropriate to the DP level. Where appropriate, this criterion also assesses awareness of safety, environmental, and ethical considerations. 
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The topic of the investigation is identified and a research question of some relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of limited relevance and does not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very limited extent since it takes into consideration few of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
3–4
• The topic of the investigation is identified and a relevant but not fully focused research question is described.
• The background information provided for the investigation is mainly appropriate and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has limitations since it takes into consideration only some of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
5–6
• The topic of the investigation is identified and a relevant and fully focused research question is clearly described.
• The background information provided for the investigation is entirely appropriate and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it takes into consideration all, or nearly all, of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
Analysis
At the root of this section is the data generated and how it is processed. If there is insufficient data then any treatment will be superficial. It is hoped that a student would recognize such a lack and revisit the method before the analysis is arrived at. Alternatively, the use of databases or simulations to provide sufficient material for analysis could help in such situations. Any treatment of the data must be appropriate to the focus of the investigation in an attempt to answer the research question. The conclusions drawn must be based on the evidence obtained from the data rather than on assumptions. Given the scope of the internal assessment and the time allocated, it is more than likely that variability in the data will lead to a tentative conclusion. This should be recognized and the extent of the variability considered. The variability should be demonstrated and explained and its impact on the conclusion fully acknowledged. It is important to note that, in this criterion, the word “conclusion” refers to a deduction based on direct interpretation of the data, which is based on asking questions such as: What does the graph show? Does any statistical test used support the conclusion?
This criterion assesses the extent to which the student’s report provides evidence that the student has selected, recorded, processed and interpreted the data in ways that are relevant to the research question and can support a conclusion.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very incomplete.
3–4
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the research question can be deduced.
5–6
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the research question can be deduced.
Evaluation
Although it may appear that the student is asked to repeat the analysis of the data and the drawing of a conclusion again in the evaluation, the focus is different.
Once again the data and conclusion come under scrutiny but, in the evaluation, the conclusion is placed into the context of the research question. So, in the analysis, it may be concluded that there is a positive correlation between x and y; in the evaluation, the student is expected to put this conclusion into the context of the original aim. In other words, does the conclusion support the student’s original thinking in the topic? If not, a consideration of why it does not will lead into an evaluation of the limitations of the method and suggestions as to how the method and approach could be adjusted to generate data that could help draw a firmer conclusion. Variability of the data may well be mentioned again in the evaluation as this provides evidence for the reliability of the conclusion. This will also lead into an assessment of the limitations of the method. It is the focus on the limitations that is at issue in the evaluation, rather than a reiteration that there is variability.
This criterion assesses the extent to which the student’s report provides evidence of evaluation of the investigation and the results with regard to the research question and the accepted scientific context.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• A conclusion is outlined which is not relevant to the research question or is not supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension of the investigation.
3–4
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are described and provide evidence of some awareness of the methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of the investigation.
5–6
• A detailed conclusion is described and justified which is entirely relevant to the research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are discussed and provide evidence of a clear understanding of the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the investigation.
Communication
The marking points for communication take the entire write-up into consideration. If a report is clearly written and logically presented there should be no need for the teacher to re-read it. The information and explanations should be targeted at the question in hand rather than being a general exposition of the subject area; in other words, the report should be focused. The vocabulary should be subject-specific and of a quality appropriate to diploma level. The subject-specific conventions that can be expected are the correct formats for graphs, tables and cell headings, correct use of units and the recording of errors. This is not to say that the presentation needs to be faultless to gain full marks. Minor errors are acceptable as long as they do not have a significant bearing on understanding or the interpretation of the results.
This criterion assesses whether the investigation is presented and reported in a way that supports effective communication of the focus, process and outcomes.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
3–4
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not hamper understanding.
Investigation 1
Equipment
• 10 Petri dishes
• 100g of "Yates premium quality" potting mix
• 5.00g of hay
• 5.00g of Eucalyptus leaves
• 5.00g of grass
• Electronic weighing scale (±0.01g)
• 100 seeds of E.
pilularis that are 2.00 mm in diameter (±0.5mm)
• 10.0cm ruler (±0.5mm)
• 100ml of de-ionized water to create the smoke water
• 100ml of de-ionized water to create the control
• Tea strainer
• 3 x 250ml graduated beaker (±0.4mL)
• Matches
• 2 Sand baths
• 2 thermometers (±0.05°c)
International Baccalaureate
A study on the effect of smoke water on the germination and growth of Eucalyptus pilularis
Background
Australia is a country where bushfires are commonplace during the summer season, and these fires affect much of Australia's flora. As a by-product of this, numerous native Australian plants that inhabit fire-dependent ecosystems have evolved reproductive strategies to adapt to factors associated with fire. These adaptations that affect their germination can be classified as either physical (derived from the immense heat of the bushfire stimulating a seed to germinate) or chemical (derived from a combination of various chemical elements produced by the smoke that stimulates germination).
Aim
The aim of this biology laboratory experiment is to explore the effects of smoke water, a mixture of water, burnt plants and hay, and its effect on the germination and post germination growth Eucalyptus pilularis seeds also known as gumnut or blackbutt, an Australian native plant which predominates in forests that are frequently burned.
To create the smoke water
1. Place 5g each of the hay, grass and Eucalyptus leaves into one of the 250ml beaker.
2. Ignite the organic matter with a match so that they catch on fire. Let them burn until they are all charred.
3. Measure 100ml of de-ionized water with the second 250ml beakers. Pour this water into the first beaker with the leaves, hay and twigs and leave to infuse for 5 hours.
4. Strain the smoke water mixture into the third measuring beaker using the tea strainer, ensuring that you are only left with the liquid remnants.
SAFETY
Care should be taken when burning the organic matter, this should be carried out in a ventilated area and the beakers should be made of heat resistance glass.
Research question
Does smoke water stimulate germination and post germination growth of Eucalyptus pilularis seeds compared to de-ionized water?
Prediction
Smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water, and thus, as a result of this, the post germination growth of the Eucalyptus pilularis seeds by the smoke water will be more effective. Effectiveness, for this experiment, is defined as the height of the seedling that emerges from the germinated gumnut seed. If the various chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of organic matter function as chemical triggers, then Eucalyptus pilularis will begin its germination out of its dormant state. These phosphorous and nitrogenous compounds, such as NaNO3, KNO3, NH4Cl and NH4NO3, that are naturally occurring in organic matter, are not found in de-ionized water (Dixon et al. 1995), and hence, smoke water is predicted to germinate a larger number of seeds and grow more after germination than de-ionized water 1.
Germination and growth
1. Set the sand baths to 30 degrees Celsius and place a thermometer in each one to verify the temperature setting.
2. Place 5 Petri dishes into one sand bath and the remaining 5 Petri dishes into another. One will be our control and one will be our test.
3. Measure out 10 x 10.0g of the potting mix using the electronic weighing scale and place 10.0g into each one of 10 Petri dishes. 5 dishes for smoke water treatment and 5 dishes for de-ionised water treatment.
4. Sow 10 gumnuts into each Petri dish and submerge them into the potting mix at a consistent depth of 0.5cm. Place the seeds towards the edges of the Petri dish so they can be observed through the glass without having to disturb the seeds to observe them.
5.
Water the control sand bath at 8:15am with 10ml of de-ionized or smoke water each day for fourteen days.
6. After 14 days, count the number of seeds germinated (distinguished by the emergence of the seedling) and measure the height of the emergent seedling in the test and the control groups with the 10.0cm ruler. The seedling height is measured from the soil surface to the highest part of the stem.
7. Repeat the set up once to ensure sufficient data.
Method
Preliminary experiment
The gumnut seeds were obtained from trees growing in local forestry plantations. It was felt necessary to find out if the gumnut seeds would germinate or not.
1. 50 seeds were planted in 5 Petri dishes of potting mixture (10 seeds per dish).
2. Each dish was watered with 10 ml of de-ionised water and left for two weeks at room temperature.
3. At the end of the two weeks the numbers of seeds germinating was counted.
Results
Number of seeds germinating = 22/50
Percentage germination = 44%
The supply of seeds was considered viable enough to proceed with the experiment.
Controlled Variables
• The same volume (10ml) of liquid is added to each dish at the same time (8:15am) each day throughout the 14 days.
• All 100 E. pilularis seeds that were used in this experiment were kept within a size range of 2.00 mm in diameter
• The water used to create the smoke water was de-ionized water like the control, which allowed consistency between the control and the test groups.
• The temperature of the seeds was kept constant at 30.0°C by the sand baths.
• The potting mix for the seeds was from the same brand, "Yates premium potting mix" and the mass of potting mix used for the seeds was kept constant at 10.0g.
• Same amount of light was assumed to be received for each plant as the experiment was conducted in the same location on the same days.
• The seeds were placed at a depth of 0.5cm into the soil in the Petri dish.
http://anpsa.org.au/APOL2/jun96-6.html
Biology teacher support material
The experiment continued for fourteen days to allow for sufficient time to gauge of the effect of the different water types, the manipulated variable. Both sand baths set at the same temperature are placed next to each other, as specified by the method, and they are assumed to be receiving equal amounts of light. The potting mix was taken from the same batch, so all samples could be assumed to contain the same ratio of ingredients. Furthermore, the E. pilularis was submerged into the potting mix at a consistent depth of 0.5cm and towards the edges of the Petri dish to allow for observations to be made through the glass without having to disrupt the seeds to observe them.
Number of seeds successfully germinated
In order to determine the number of seeds that were germinated successfully, the number of seeds that showed distinct cracking of the seed coat and the emergence of the seedling for both the smoke water and the de-ionized water test groups were counted and placed into the table below. The raw data is presented in appendix A.
Water Type   Trial   Numbers germinated (/50)   Average   %
De-ionized   1       26                         25        49
             2       23
Smoked       1       43                         44        88
             2       45
From the processed data that informs us about the number of seeds successfully germinated, we can clearly see that smoke water germinates, on average.
Our method of data collection for this experiment is to count the seeds that successfully germinated from the different Petri dishes in the control and test groups respectively, the measured variable. This is done by observing through the side of the Petri dish whether the seed coat has broken and the seedling has emerged. The other way to collect data in this experiment is to measure the height of the seedlings (from the soil surface to the seedling tip) of the germinated seeds after the 14 days of the experiment.
The difference between smoke water and de-ionised water was determined using the χ² test for the germination and the t-test for the growth of the seedlings.
[Graph of de-ionised water seed germination. Legend: Percentage de-ionised water gumnut seeds germinated; Percentage de-ionised water gumnut seeds NOT germinated]
Assumptions
• The light is of the same intensity because the seeds will be set up side by side.
• The de-ionized water contains the same impurities
• The potting mix contains the same amount of its constituent components.
• The impurities and chemical elements in the air will be the same for both sets of seeds.
• The gumnut seeds are all composed of the same percentage of elements.
[Graph of smoke water seed germination]
Observations
• The E. pilularis seeds were no bigger than 2mm, and were brownish black in colour. There were no obvious signs of previous germination, or cracking of the outer seed coat.
• The smoke water was clearly distinctive from the de-ionized water. The de-ionized water was clear, as one would expect if it had been filtered. The smoke water, however, had a blackish, straw coloured hue, due to its absorption of the remnants of the burnt organic matter.
• Definite germination was seen on a lot more seeds with the smoke water than with the de-ionized water.
• The E. pilularis subjected to smoke water germinated earlier on average than the seeds subjected to de-ionized water. Seeds with smoke water started showing first signs of germination as early as 7 days, when their seed coats started to split to allow the seedlings to emerge. In comparison, the de-ionized watered seeds took up to 10 days to start showing germination.
• The E. pilularis that were germinated by the smoke water tended to have larger seedlings emerging from the split seed coat.
• The E. pilularis that were watered with the smoke water had significantly larger cracking of the seed coat, allowing for more space for the seedlings to grow and extend outwards from the shell.
• The colour of the seedlings in both experiments was a distinct dark purple colour, and leaves appeared only on the smoke water experiment, with a maximum of 2 small, juvenile leaves found, measuring no more than approximately 50.0mm.

[Pie chart legend: percentage of smoke water gumnut seeds germinated and NOT germinated]

Page 25

The effect of smoke water and de-ionized water on post germination growth
This section of the experiment is designed to test the effectiveness of gumnut seed germination, depending on the type of water it received, either de-ionized or smoke water. Effectiveness was determined by the height of the seedling that emerged from the seed coat of the germinated gumnut seeds. The higher the seedling the more effective the water is on germination. The raw data is presented in appendix A.

Height of seedlings for germinated seeds
Water Type    Trial    Trial average of seedling height /mm ±0.5mm    Trial Standard Deviation    Overall average height /mm ±0.5mm    Overall standard deviation
De-ionized    1        13.0                                           13.4                        23.4                                 13.6
              2        11.8                                           13.9
Smoked        1        57.8                                           24.5                        59.5                                 12.4
              2        61.1                                           22.3

On first observation of the processed data, it can be seen that smoked water clearly has a higher average seedling height than the de-ionized water whilst also having a lower standard deviation. This indicated that the seedlings of the smoked water seeds grew higher than those of the de-ionized water seeds. The error bars in the graph below suggest that there may be a significant difference between the effects of the treatments on seedling growth. However, the range of variation in the results as given by the standard deviations is large, especially for the de-ionised water treatment trials. To verify this, a t-test was carried out on the data.

[Bar chart: The effect of smoke water on the growth of gumnut (Eucalyptus pilularis) seedlings. x-axis: Treatment (Smoke water, De-ionised water); y-axis: Average seedling length / mm; error bars = ±1 standard deviation]

χ² test
In order to see if there is a significant difference between the germination of the seeds treated with smoke water and de-ionised water a χ² test was carried out.

Null Hypothesis: Smoke water does not affect germination of gumnut seeds
Alternative Hypothesis: Smoke water affects germination of gumnut seeds

                  Smoke water    De-ionised water    Row total
Germinated        88             49                  137
Not germinated    12             51                  63
Column total      100            100                 200

Proportion of seeds germinating = 137/200 = 68.5%
Proportion of seeds not germinating = 100 – 68.5 = 31.5%

Expected number of smoke water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of de-ionised water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of smoke water treated seeds not to germinate = 31.5% of 100 = 31.5
Expected number of de-ionised water treated seeds not to germinate = 31.5% of 100 = 31.5

Observed frequency (O)    Expected frequency (E)    Difference (O−E)    Positive difference |O−E|    (|O−E|)²/E
88                        68.5                      19.5                19.5                         5.55
49                        68.5                      -19.5               19.5                         5.55
12                        31.5                      -19.5               19.5                         12.07
51                        31.5                      19.5                19.5                         12.07
                                                                        χ²calc                       35.25

Number of degrees of freedom = (rows – 1) x (columns – 1) = (2-1) x (2-1) = 1
χ²crit = 3.84 for p=0.05
Since the test value χ²calc = 35.25 is a lot greater than the critical value χ²crit = 3.84 we must reject the Null Hypothesis and accept the Alternative Hypothesis. The test value is significant for p < 0.001

Page 26

t-test
In order to statistically test whether the shoots of gumnut seedlings germinated with smoke water grew more than those germinated with de-ionized water, a two-tailed t-test for independent samples was carried out to investigate whether there is a significant difference between the growth of the seedlings.
• Null Hypothesis - the smoke water has no effect on post germination growth of the gumnut seedlings.
• Alternative Hypothesis - the smoke water does have an effect on post germination growth of the gumnut seedlings.

Conclusion
In conclusion, the experiment supported my hypothesis that smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water. Furthermore, the subsequent growth of the Eucalyptus pilularis seeds by the smoke water was found to be more effective than the de-ionized water due to the significantly taller seedlings of the Eucalyptus pilularis that were exposed to the smoke water. This could be because the various chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of the burnt organic matter (in my case, the burnt leaves, hay and twigs) acted as chemical triggers for the E. pilularis to begin its germination out of its dormant state and stimulate its subsequent growth. While all of the active compounds in smoke have not yet been identified, a large majority of the compounds present in the smoke water mixture (NaNO3, KNO3, NH4Cl and NH4NO3) are water soluble, thus they are easily able to be taken in by the gumnut seed and, once inside the seed, they are used as these so-called "chemical triggers" to start germination. These chemical triggers work by altering the levels of chemicals that the seed maintains in homeostasis; once the seed has registered these differing levels of phosphorous and nitrogenous compounds, it stimulates the germination of the seed. There are, however, compounds called butenolides that have confirmed germination-promoting action. These butenolides are produced by some plants on exposure to high temperatures and smoke caused by bush fires.
In particular, botanists Flematti, Ghisalberti, Dixon and Trengove isolated a particular butenolide called 3-methyl-2H-furo[2,3-c]pyran-2-one, which was found to trigger seed germination in plants whose reproduction is fire-dependent, such as the E. pilularis used in my experiment 3. One theory about how this butenolide, 3-methyl-2H-furo[2,3-c]pyran-2-one, is formed by the plant is given to us by Light, Burger and van Staden, who hypothesized that this particular butenolide was created from cellulose within the plant, and this substance, created by the cellulose, stimulated the seed's reproductive cycle, and hence, germination 4.

The two pie graphs that show the percentage of seeds germinated for the smoke water experiment and de-ionized water experiment respectively, furthermore indicate that my hypothesis was correct, with 88% of the smoke watered seeds successfully germinating compared to only 49% of the de-ionized water seeds germinating. This was backed up by my χ²-test, which allowed us to reject the null hypothesis with a 95% degree of confidence, indicating that the smoke water successfully germinated more seeds than the de-ionized water. The t-test on the seedling growth shows that the smoke water has a significant positive effect on the gumnut seedlings.

t-test formula:
degrees of freedom = n1 + n2 – 2 = 198
tcalc = 17.4
tcrit (p=0.05) = 1.97
Because our test t value tcalc = 17.4 is greater than the critical value tcrit = 1.97 at p = 0.05, we can accept the alternative hypothesis, that the smoke water significantly stimulates the growth of the gumnut seedlings germinated.
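The t-test figures quoted above can be checked with a short calculation. The following is only a sketch: the report does not show the formula that was actually used, so a pooled-variance two-sample t statistic is assumed here. The summary values passed in (59.5 ± 23.4 mm for smoke water and 12.4 ± 13.6 mm for de-ionised water, with n = 100 seeds per treatment and ungerminated seeds counted as 0 mm) are also an assumption, chosen because they are consistent with the trial averages and Appendix A. The function name `two_sample_t` is hypothetical.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees of freedom) for two independent samples.

    Uses the pooled-variance form of the t-test, which is an assumption;
    the report does not state which form was applied.
    """
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, dof


# Assumed summary statistics (mm), 100 seeds per treatment.
t_calc, dof = two_sample_t(59.5, 23.4, 100, 12.4, 13.6, 100)

T_CRIT = 1.97  # critical value at p = 0.05, as quoted in the report

print(f"t = {t_calc:.1f}, degrees of freedom = {dof}")
print("significant at p = 0.05:", t_calc > T_CRIT)
```

Under these assumptions the statistic comes out close to the reported tcalc, with 198 degrees of freedom.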
The test value is significant for p < 0.001 Evaluation of Weaknesses with suggested improvements The potting mixture used was obtained from the local garden shop, and whilst the same brand and the same amount of the potting mixture was used for both seeds in the experiment, the potting mixture may have contained impurities which could potentially have enhanced or reduced the ability of the seeds to germinate, especially because the Yates brand "Contains trace elements to add extra vital nutrients" 2. Some of the chemicals from the smoke water also could have potentially reacted with some of the ingredients of the potting mix and rendered them useless, however the seeds watered with de-ionized water may not have had this potential problem. To improve this, I could have used a different support for the seeds such as cotton wool or filter paper. Using different types of leaves, twigs and hay to create the smoke water would give you different chemicals, as each has a differing composition of chemicals, some of which may be beneficial for germination, and some of which wouldn't. For this experiment, I could have used only one variable like hay, instead of twigs and leaves as well. This would narrow my scope of results down as well and I would potentially be able to pinpoint the specific chemical, or source of the chemical, that allows gumnuts to germinate successfully. It may be found that twigs, for example, don't enhance seed germination but leaves do. By singling out the element that best enhances seed germination, further experiments could be carried out, and the exact chemical could be identified, that best enhances the seeds germination. Bibliography Yates Gardening Ltd Sydney Australia http://www.yates.com.au/products/pots-and-potting-mix/all-purposepotting-mix/yates-premium-potting-mix/ Last visited July 10 2011 Gavin R. Flematti, Emilio L. Ghisalberti, Kingsley W. Dixon and Robert D. 
Trengove A Compound from Smoke That Promotes Seed Germination http://www.sciencemag.org/content/305/5686/977 Science 13 August 2004: Vol. 305 no. 5686 p. 977Published Online July 8 2004 Combined with this, I could have used gumnut seeds that were all the same weight rather than the same size in diameter. I tried to use gumnut seeds that were only 2.00mm in diameter, however it would have been better served to use seeds that all had a constant weight of 0.2g for example, as then I could have assumed that each seed contained the same amounts and composition of nutrients, enzymes and other chemicals inside it. Marnie E. Light, Barend V. Burger and Johannes van Staden Formation of a Seed Germination Promoter from Carbohydrates and Amino Acids http://pubs.acs.org/doi/abs/10.1021/jf050710u J. Agric. Food Chem., 2005, 53 (15), pp 5936–5942 Publication Date (Web): July 1, 2005 To further narrow my scope of the experiment, I could have tested the effects of different concentrations of the smoke water as well. Instead of only using a 1:10 ratio of 1 part twigs, hay and leaves to 10 parts de-ionized water, I could have tested a ratio of 1:5 with 1 part twigs, hay and leaves and 5 parts de- ionized water. Working out the optimum concentration of smoke water would help this experiment as better and clearer results could be obtained. 
3 2 http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/ 4 7 http://www.sciencemag.org/content/305/5686/977 http//pubs.acs.org/doi/abs/10.1021/jf050710u 8 Biology teacher support material 7 Biology teacher support material 8 Page 27 Investigation 1 Investigation 1 Seeds watered with De-ionized water (Trial 1 ) Seed Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Appendix A - raw data tables Seed Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Seeds watered with Smoke Water (Trial 1) Did the seed Germinate Height of seedling in / mm ±0.5mm Yes 56.0 Yes 71.0 Yes 73.0 Yes 67.0 Yes 54.0 No 0 Yes 58.0 Yes 70.0 Yes 66.0 Yes 61.0 Yes 64.0 Yes 71.0 No 0 No 0 Yes 59.0 Yes 67.0 Yes 58.0 Yes 63.0 Yes 62.0 Yes 64.0 Yes 72.0 Yes 75.0 No 0.0 Yes 68.0 Yes 64.0 Yes 69.0 Yes 70.0 No 0 Yes 52.0 No 0 Yes 79.0 Yes 81.0 Yes 83.0 Yes 74.0 Yes 74.0 Yes 78.0 Yes 63.0 Yes 69.0 Yes 58.0 Yes 70.0 Yes 68.0 Yes 62.0 Yes 63.0 Yes 68.0 Yes 58.0 Yes 81.0 Yes 68.0 Yes 73.0 Yes 67.0 No 0 Did the seed Germinate Yes Yes Yes No No No Yes No Yes No Yes No No Yes Yes No Yes No Yes Yes Yes No No Yes No Yes Yes No Yes Yes No No Yes Yes No No Yes Yes No No No Yes No Yes No No Yes Yes No Yes Height of seedling in / mm ±0.5mm 18 27.0 19.0 0 0 0 24.0 0 25.0 0 28.0 0 0 17.0 23.0 0 16.0 0 26.0 27.0 15.0 0 0 27.0 0 21.0 22.0 0 27.0 37.0 0 0 26.0 31.0 0 0 27.0 41.0 0 0 0 25.0 0 19.0 0 0 37.0 22.0 0 25.0 9 10 Biology teacher support material 9 Biology teacher support material 10 Page 28 Investigation 1 Investigation 1 Seeds watered with De-Ionized water (Trial 2) Seed Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Seeds watered with Smoke water (Trial 2) Did the seed 
Germinate Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes No Yes Yes Yes Yes No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes No Yes Yes Seed Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Height of seedling in mm / ±0.5mm 72.0 73.0 0 72.0 57.0 74.0 79.0 62.0 78.0 64.0 72.0 79.0 72.0 57.0 56.0 83.0 63.0 0 72.0 63.0 0 58.0 81.0 57.0 62.0 0 74.0 73.0 83.0 58.0 74.0 57.0 63.0 79.0 60.0 74.0 79.0 57.0 86.0 53.0 56.0 67.0 63.0 68.0 54.0 68.0 68.0 0 62.0 72.0 11 Biology teacher support material Did the seed Germinate No Yes Yes Yes No No No No Yes No Yes No No Yes Yes No No No Yes Yes Yes No No Yes No Yes Yes No Yes Yes No No Yes Yes No No Yes Yes No No No Yes No Yes No No Yes Yes No No Height of seedling in / mm ±0.5mm 0 26.0 21.0 23.0 0 0 0 0 31.0 0 14.0 0 0 16.0 18.0 0 0 0 26.0 31.0 25.0 0 0 21.0 0 31.0 26.0 0 23.0 36.0 0 0 14.0 23.0 0 0 23.0 27.0 0 0 0 24.0 0 45.0 0 0 42.0 23.0 0 0 12 11 Biology teacher support material 12 Page 29 Page 30 Investigation 1 (annotated) A study on the effect of smoke water on the germination and growth o f Eucalyptus pilularis Background Australia is a country where bushfires are commonplace during the summer season, and these fires affect much of Australia's flora. As a by-product of this, numerous native Australian plants that inhabit firedependent ecosystems have evolved reproductive strategies to adapt to factors associated with fire. These adaptations that affect their germination can be classified as either physical (derived from the immense heat of the bushfire stimulating a seed to germinate) or chemical (derived from a combination of various chemical elements produced by the smoke that stimulates germination). 
Aim
The aim of this biology laboratory experiment is to explore the effect of smoke water, a mixture of water, burnt plants and hay, on the germination and post germination growth of Eucalyptus pilularis seeds, also known as gumnut or blackbutt, an Australian native plant which predominates in forests that are frequently burned.

Comm: Overall the report is clear, concise and logically structured.
Comm: Subject specific terminology and notation are used throughout.
PE: Student shows a high degree of engagement with the investigation.
EX: Investigation set in context and justified.
EX: Smoke water defined
EX: Research question focussed
EX: Methodology appropriate

Prediction
Smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water, and thus, as a result of this, the post germination growth of the Eucalyptus pilularis seeds by the smoke water will be more effective. Effectiveness, for this experiment, is defined as the height of the seedling that emerges from the germinated gumnut seed. If the various chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of organic matter function as chemical triggers, then Eucalyptus pilularis will begin its germination out of its dormant state. These phosphorous and nitrogenous compounds, such as NaNO3, KNO3, NH4Cl and NH4NO3, that are naturally occurring in organic matter, are not found in de-ionized water (Dixon et al. 1995), and hence, smoke water is predicted to germinate a larger number of seeds and grow more after germination than de-ionized water 1.

Ex: Defines method to collect relevant data

Preliminary experiment
The gumnut seeds were obtained from trees growing in local forestry plantations. It was felt necessary to find out if the gumnut seeds would germinate or not.
1. 50 seeds were planted in 5 Petri dishes of potting mixture (10 seeds per dish).
2. Each dish was watered with 10 ml of de-ionised water and left for two weeks at room temperature.
3.
At the end of the two weeks the number of seeds germinating was counted.

Results
Number of seeds germinating = 22/50
Percentage germination = 44%
The supply of seeds was considered viable enough to proceed with the experiment.

Ex: Method can be easily followed and repeated by others.
Ex: Anticipates that method may need modifying. Sufficient data is planned for
Ex: Suitable control
An: Data displayed from trial run
An: Appropriate processing
Ev: Conclusion made from trial run.

Equipment
• 10 Petri dishes
• 100g of "Yates premium quality" potting mix
• 5.00g of hay
• 5.00g of Eucalyptus leaves
• 5.00g of grass
• Electronic weighing scale (±0.01g)
• 100 seeds of E. pilularis that are 2.00 mm in diameter (±0.5mm)
• 10.0cm ruler (±0.5mm)
• 100ml of de-ionized water to create the smoke water
• 100ml of de-ionized water to create the control
• Tea strainer
• 3 x 250ml graduated beaker (±0.4mL)
• Matches
• 2 Sand baths
• 2 thermometers (±0.05°C)

To create the smoke water
1. Place 5g each of the hay, grass and Eucalyptus leaves into one of the 250ml beakers.
2. Ignite the organic matter with a match so that it catches fire. Let it burn until it is all charred.
3. Measure 100ml of de-ionized water with the second 250ml beaker. Pour this water into the first beaker with the leaves, hay and twigs and leave to infuse for 5 hours.
4. Strain the smoke water mixture into the third measuring beaker using the tea strainer, ensuring that you are only left with the liquid remnants.

SAFETY
Care should be taken when burning the organic matter; this should be carried out in a ventilated area and the beakers should be made of heat-resistant glass.

Research question
Does smoke water stimulate germination and post germination growth of Eucalyptus pilularis seeds compared to de-ionized water?

1 http://anpsa.org.au/APOL2/jun96-6.htmI

Method
Germination and growth
1.
Set the sand baths to 30 degrees Celsius and place a thermometer in each one to verify the temperature setting.
2. Place 5 Petri dishes into one sand bath and the remaining 5 Petri dishes into another. One will be our control and one will be our test.
3. Measure out 10 x 10.0g of the potting mix using the electronic weighing scale and place 10.0g into each one of 10 Petri dishes. 5 dishes for smoke water treatment and 5 dishes for de-ionised water treatment.
4. Sow 10 gumnuts into each Petri dish and submerge them into the potting mix at a consistent depth of 0.5cm. Place the seeds towards the edges of the Petri dish so they can be observed through the glass without having to disturb the seeds to observe them.
5. Water the control sand bath at 8:15am with 10ml of de-ionized or smoke water each day for fourteen days.
6. After 14 days, count the number of seeds germinated (distinguished by the emergence of the seedling) and measure the height of the emergent seedling in the test and the control groups with the 10.0cm ruler. The seedling height is measured from the soil surface to the highest part of the stem.
7. Repeat the set up once to ensure sufficient data.

Controlled Variables
• The same volume (10ml) of liquid is added to each dish at the same time (8:15am) each day throughout the 14 days.
• All 100 E. pilularis seeds that were used in this experiment were kept within a size range of 2.00 mm in diameter
• The water used to create the smoke water was de-ionized water like the control, which allowed consistency between the control and the test groups.
• The temperature of the seeds was kept constant at 30.0°C by the sand baths.
• The potting mix for the seeds was from the same brand, "Yates premium potting mix" and the mass of potting mix used for the seeds was kept constant at 10.0g.
• The same amount of light was assumed to be received by each plant, as the experiment was conducted in the same location on the same days.
• The seeds were placed at a depth of 0.5cm into the soil in the Petri dish.

Ex: Safety risks assessed
Ex: Plans for sufficient data
Ex: Plans for sufficient data
Comm: Correct definition of germination
Ex: Plans for sufficient data
Ex: Thorough consideration of the other factors that may influence the investigation

Page 31

Number of seeds successfully germinated
In order to determine the number of seeds that germinated successfully, the seeds that showed distinct cracking of the seed coat and emergence of the seedling were counted for both the smoke water and the de-ionized water test groups and placed into the table below. The raw data is presented in appendix A.

Water Type    Trial    Numbers germinated (/50)    Average    %
De-ionized    1        26                          25         49
              2        23
Smoked        1        43                          44         88
              2        45

The experiment continued for fourteen days to allow sufficient time to gauge the effect of the different water types, the manipulated variable. Both sand baths, set at the same temperature, were placed next to each other, as specified by the method, and they are assumed to be receiving equal amounts of light. The potting mix was taken from the same batch, so all samples could be assumed to contain the same ratio of ingredients. Furthermore, the E. pilularis seeds were submerged into the potting mix at a consistent depth of 0.5cm and towards the edges of the Petri dish to allow observations to be made through the glass without having to disturb the seeds.
Our method of data collection for this experiment is to count the seeds that successfully germinated from the different Petri dishes in the control and test groups respectively, the measured variable. This is done by observing through the side of the Petri dish whether the seed coat has broken and the seedling has emerged. The other way to collect data in this experiment is to measure the height of the seedlings (from the soil surface to the seedling tip) of the germinated seeds after the 14 days of the experiment. The difference between smoke water and de-ionised water was determined using the χ² test for the germination and the t-test for the growth of the seedlings.

From the processed data that informs us about the number of seeds successfully germinated, we can clearly see that smoke water germinates, on average.

[Pie chart: Graph of de-ionised water seed germination, showing the percentage of de-ionised water gumnut seeds germinated and NOT germinated]

Comm: Data analysis can be followed (no need for a worked example here)
An: Uncertainties missing but not considered relevant here for a count. However uncertainties ±2% could have featured for the percentage germination data.
An: Appropriate graphical presentation of processed data
An: Appropriate method of analysis chosen

Assumptions
• The light is of the same intensity because the seeds will be set up side by side.
• The de-ionized water contains the same impurities
• The potting mix contains the same amount of its constituent components.
• The impurities and chemical elements in the air will be the same for both sets of seeds.
• The gumnut seeds are all composed of the same percentage of elements.

Comm: Clear presentation of graph

Observations
• The E. pilularis seeds were no bigger than 2mm, and were brownish black in colour. There were no obvious signs of previous germination, or cracking of the outer seed coat.
• The smoke water was clearly distinctive from the de-ionized water. The de-ionized water was clear, as one would expect if it had been filtered. The smoke water, however, had a blackish, straw coloured hue, due to its absorption of the remnants of the burnt organic matter.
• Definite germination was seen on a lot more seeds with the smoke water than with the deionized water.
• The E. pilularis subjected to smoke water germinated earlier on average than the seeds subjected to de-ionized water. Seeds with smoke water started showing first signs of germination as early as 7 days, when their seed coats started to split to allow the seedlings to emerge. In comparison, the de-ionized watered seeds took up to 10 days to start showing germination.
• The E. pilularis that were germinated by the smoke water tended to have larger seedlings emerging from the split seed coat.
• The E. pilularis that were watered with the smoke water had significantly larger cracking of the seed coat, allowing for more space for the seedlings to grow and extend outwards from the shell.
• The colour of the seedlings in both experiments was a distinct dark purple colour, and leaves appeared only on the smoke water experiment, with a maximum of 2 small, juvenile leaves found, measuring no more than approximately 50.0mm.

An: Adequate qualitative observations made
Comm: Data table set in context. Clear, unambiguous presentation

Page 32

χ² test
In order to see if there is a significant difference between the germination of the seeds treated with smoke water and de-ionised water a χ² test was carried out.

The effect of smoke water and de-ionized water on post germination growth
This section of the experiment is designed to test the effectiveness of gumnut seed germination, depending on the type of water it received, either de-ionized or smoke water.
Effectiveness was determined by the height of the seedling that emerged from the seed coat of the germinated gumnut seeds. The higher the seedling the more effective the water is on germination. The raw data is presented in appendix A.

Height of seedlings for germinated seeds
Water Type    Trial    Trial average of seedling height /mm ±0.5mm    Trial Standard Deviation    Overall average height /mm ±0.5mm    Overall standard deviation
De-ionized    1        13.0                                           13.4                        23.4                                 13.6
              2        11.8                                           13.9
Smoked        1        57.8                                           24.5                        59.5                                 12.4
              2        61.1                                           22.3

An: The candidate considers the reliability of the data though it could be argued that ungerminated seeds should not be included here. These results (0cm growth) skew the distribution so that it is not normally distributed.

[Graph x-axis: Treatment (Smoke water, De-ionised water)]

Null Hypothesis: Smoke water does not affect germination of gumnut seeds
Alternative Hypothesis: Smoke water affects germination of gumnut seeds

                  Smoke water    De-ionised water    Row total
Germinated        88             49                  137
Not germinated    12             51                  63
Column total      100            100                 200

Proportion of seeds germinating = 137/200 = 68.5%
Proportion of seeds not germinating = 100 – 68.5 = 31.5%

O       E       Difference (O−E)    Positive difference |O−E|    (|O−E|)²/E
88      68.5    19.5                19.5                         5.55
49      68.5    -19.5               19.5                         5.55
12      31.5    -19.5               19.5                         12.07
51      31.5    19.5                19.5                         12.07
                                    χ²calc                       35.25

Number of degrees of freedom = (rows – 1) x (columns – 1) = (2-1) x (2-1) = 1
χ²crit = 3.84 for p=0.05
Comm: Data processing can be followed.
Since the test value χ²calc = 35.25 is a lot greater than the critical value χ²crit = 3.84 we must reject the Null Hypothesis and accept the Alternative Hypothesis. The test value is significant for p < 0.001
An: Processed data correctly interpreted
An: Successful data analysis completed. Conclusion can be deduced.

Comm: Data table set in context. Clear, unambiguous presentation. Processing can be followed, a worked example is not expected here.
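The χ² arithmetic in this section can be reproduced in a few lines. The sketch below is not part of the student's report: it simply recomputes the expected frequencies from the row and column totals of the 2 x 2 table and sums (O−E)²/E, using the observed counts and the critical value quoted in the text. The function name `chi_squared` is hypothetical.

```python
def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts. Expected counts are derived
    from the row and column totals, as in the worked calculation in the report.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: germinated, not germinated. Columns: smoke water, de-ionised water.
observed_counts = [[88, 49],
                   [12, 51]]

chi2_calc, dof = chi_squared(observed_counts)

CHI2_CRIT = 3.84  # critical value for 1 degree of freedom at p = 0.05, as quoted

print(f"chi-squared = {chi2_calc:.2f}, degrees of freedom = {dof}")
print("reject the null hypothesis:", chi2_calc > CHI2_CRIT)
```

Working from unrounded terms gives a value that agrees with the reported χ²calc to within rounding.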
Processing can be followed. Correct conventions for uncertainties

[Bar chart: The effect of smoke water on the growth of gumnut (Eucalyptus pilularis) seedlings. y-axis: Average seedling length / mm; error bars = ±1 standard deviation]

Comm: Terminology is imprecise here. Strictly speaking this is post germination growth

On first observation of the processed data, it can be seen that smoked water clearly has a higher average seedling height than the de-ionized water whilst also having a lower standard deviation. This indicated that the seedlings of the smoked water seeds grew higher than those of the de-ionized water seeds. The error bars in the graph below suggest that there may be a significant difference between the effects of the treatments on seedling growth. However, the range of variation in the results as given by the standard deviations is large, especially for the de-ionised water treatment trials. To verify this, a t-test was carried out on the data.

Expected number of smoke water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of de-ionised water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of smoke water treated seeds not to germinate = 31.5% of 100 = 31.5
Expected number of de-ionised water treated seeds not to germinate = 31.5% of 100 = 31.5

Page 33

t-test
In order to statistically test whether the shoots of gumnut seedlings germinated with smoke water grew more than those germinated with de-ionized water, a two-tailed t-test for independent samples was carried out to investigate whether there is a significant difference between the growth of the seedlings.
• Null Hypothesis - the smoke water has no effect on post germination growth of the gumnut seedlings.
• Alternative Hypothesis - the smoke water does have an effect on post germination growth of the gumnut seedlings.

An: Appropriate method of analysis chosen.
t-test formula:
degrees of freedom = n1 + n2 – 2 = 198
tcalc = 17.4
tcrit (p=0.05) = 1.97
Because our test t value tcalc = 17.4 is greater than the critical value tcrit = 1.97 at p = 0.05, we can accept the alternative hypothesis, that the smoke water significantly stimulates the growth of the gumnut seedlings germinated. The test value is significant for p < 0.001

Comm: Processing can be followed.
An: Successful data analysis and interpretation completed

Evaluation of Weaknesses with suggested improvements
The potting mixture used was obtained from the local garden shop, and whilst the same brand and the same amount of the potting mixture was used for both sets of seeds in the experiment, the potting mixture may have contained impurities which could potentially have enhanced or reduced the ability of the seeds to germinate, especially because the Yates brand "Contains trace elements to add extra vital nutrients" 2. Some of the chemicals from the smoke water also could have potentially reacted with some of the ingredients of the potting mix and rendered them useless; however, the seeds watered with de-ionized water may not have had this potential problem. To improve this, I could have used a different support for the seeds such as cotton wool or filter paper.

Ev: Student considers the reliability of the data and considers the impact of experimental uncertainty
Ev: Sensible suggested improvement.

Using different types of leaves, twigs and hay to create the smoke water would give you different chemicals, as each has a differing composition of chemicals, some of which may be beneficial for germination, and some of which wouldn't. For this experiment, I could have used only one variable like hay, instead of twigs and leaves as well. This would narrow my scope of results down as well and I would potentially be able to pinpoint the specific chemical, or source of the chemical, that allows gumnuts to germinate successfully.
It may be found that twigs, for example, don't enhance seed germination but leaves do. By singling out the element that best enhances seed germination, further experiments could be carried out, and the exact chemical could be identified, that best enhances the seeds germination. Ev: Feasible extension proposed. Combined with this, I could have used gumnut seeds that were all the same weight rather than the same size in diameter. I tried to use gumnut seeds that were only 2.00mm in diameter, however it would have been better served to use seeds that all had a constant weight of 0.2g for example, as then I could have assumed that each seed contained the same amounts and composition of nutrients, enzymes and other chemicals inside it. Ev: Suggested improvement impractical To further narrow my scope of the experiment, I could have tested the effects of different concentrations of the smoke water as well. Instead of only using a 1:10 ratio of 1 part twigs, hay and leaves to 10 parts de-ionized water, I could have tested a ratio of 1:5 with 1 part twigs, hay and leaves and 5 parts de- ionized water. Working out the optimum concentration of smoke water would help this experiment as better and clearer results could be obtained. Investigation 1 (annotated) Conclusion In conclusion, the experiment supported my hypothesis that smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water. Furthermore, the subsequent growth of the Eucalyptus pilularis seeds by the smoke water was found to be more effective than the de-ionized water due to the significantly taller seedlings of the Eucalyptus pilularis that were exposed to the smoke water. This could because the various chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of the burnt organic matter (in my case, the burnt leaves, hay and twigs) acted as chemical triggers for the E. 
pilularis to begin its germination out of its dormant state and stimulate its subsequent growth. While all of the active compounds in smoke have not yet been identified, a large majority of the compounds present in the smoke water mixture (NaNO3, KNO3, NH4Cl and NH4NO3) are water soluble, thus they are easily able to be taken in by the gumnut seed and, once inside the seed, they are used as these so-called "chemical triggers" to start germination. These chemical triggers work by altering the levels of chemicals that the seed maintains in homeostasis; once the seed has registered these differing levels of phosphorous and nitrogenous compounds, it stimulates the germination of the seed.

There are, however, compounds called butenolides that have confirmed germination-promoting action. These butenolides are produced by some plants on exposure to high temperatures and smoke caused by bush fires. In particular, botanists Flematti, Ghisalberti, Dixon and Trengove isolated a particular butenolide called 3-methyl-2H-furo[2,3-c]pyran-2-one, which was found to trigger seed germination in plants whose reproduction is fire-dependent, such as the E. pilularis used in my experiment 3. One theory about how this butenolide, 3-methyl-2H-furo[2,3-c]pyran-2-one, is formed by the plant is given to us by Light, Burger and van Staden, who hypothesized that this particular butenolide was created from cellulose within the plant, and this substance, created by the cellulose, stimulated the seed's reproductive cycle, and hence, germination 4.

The two pie graphs that show the percentage of seeds germinated for the smoke water experiment and de-ionized water experiment respectively, furthermore indicate that my hypothesis was correct, with 88% of the smoke watered seeds successfully germinating compared to only 47% of the de-ionized water seeds germinating.
This was backed up with my χ2-test that accurately concluded that we could reject the null hypothesis, with a 95% degree of confidence, that the smoke water successfully germinated more seeds than the de-ionized water. The t-test on the seedling growth shows that the smoke water has a significant positive effect on the gumnut seedlings.

Ev: Successful interpretation of the results. Relevant justified conclusion drawn.

Bibliography

Yates Gardening Ltd, Sydney, Australia. http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/ Last visited July 10 2011.

Gavin R. Flematti, Emilio L. Ghisalberti, Kingsley W. Dixon and Robert D. Trengove. A Compound from Smoke That Promotes Seed Germination. Science, 13 August 2004: Vol. 305 no. 5686 p. 977. Published online July 8 2004. http://www.sciencemag.org/content/305/5686/977

Marnie E. Light, Barend V. Burger and Johannes van Staden. Formation of a Seed Germination Promoter from Carbohydrates and Amino Acids. J. Agric. Food Chem., 2005, 53 (15), pp 5936–5942. Publication date (web): July 1, 2005. http://pubs.acs.org/doi/abs/10.1021/jf050710u

Ev: Unsafe assumption.
Ev: Feasible extension proposed.
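As a rough check on the two tests cited in the conclusion, the sketch below recomputes a χ2 statistic from the germination percentages and shows how a pooled two-sample t value and its degrees of freedom are obtained. It rests on assumptions that are not stated in the report itself: 100 seeds per treatment (two trials of 50), so that 88% and 47% correspond to 88 and 47 germinated seeds, no continuity correction, and the standard critical value of 3.841 for one degree of freedom at p = 0.05. The function names and the five heights per group passed to the t-test are hypothetical illustration values, not the student's data set.

```python
import math
from statistics import mean, variance


def chi_squared_2x2(a_yes, a_no, b_yes, b_no):
    """Chi-squared statistic (no continuity correction) for a 2x2 table."""
    observed = [[a_yes, a_no], [b_yes, b_no]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if germination were independent of treatment
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def pooled_t(sample1, sample2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / df
    t = (mean(sample1) - mean(sample2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


CHI2_CRIT = 3.841  # critical value for df = 1 at p = 0.05

# Assumed counts: 88 of 100 smoke-water seeds, 47 of 100 de-ionized-water seeds
chi2 = chi_squared_2x2(88, 12, 47, 53)
print(f"chi-squared = {chi2:.2f}, reject null hypothesis: {chi2 > CHI2_CRIT}")

# Hypothetical heights in mm, for illustration only
smoke = [56.0, 71.0, 73.0, 67.0, 54.0]
deionized = [18.0, 27.0, 19.0, 24.0, 25.0]
t, df = pooled_t(smoke, deionized)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

With two groups of 100 values each, the same `pooled_t` function would give n1 + n2 – 2 = 198 degrees of freedom.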
2 http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/
3 http://www.sciencemag.org/content/305/5686/977
4 http://pubs.acs.org/doi/abs/10.1021/jf050710u

Ev: Compare to relevant scientific theory

Page 34

Appendix A - raw data tables

An: Raw data recorded includes uncertainties
Comm: Decimal places should be consistent

Trial 1 (height of seedling / mm, ±0.5 mm)

Seed    Smoke water              De-ionized water
number  Germinated  Height       Germinated  Height
1       Yes         56.0         Yes         18
2       Yes         71.0         Yes         27.0
3       Yes         73.0         Yes         19.0
4       Yes         67.0         No          0
5       Yes         54.0         No          0
6       No          0            No          0
7       Yes         58.0         Yes         24.0
8       Yes         70.0         No          0
9       Yes         66.0         Yes         25.0
10      Yes         61.0         No          0
11      Yes         64.0         Yes         28.0
12      Yes         71.0         No          0
13      No          0            No          0
14      No          0            Yes         17.0
15      Yes         59.0         Yes         23.0
16      Yes         67.0         No          0
17      Yes         58.0         Yes         16.0
18      Yes         63.0         No          0
19      Yes         62.0         Yes         26.0
20      Yes         64.0         Yes         27.0
21      Yes         72.0         Yes         15.0
22      Yes         75.0         No          0
23      No          0.0          No          0
24      Yes         68.0         Yes         27.0
25      Yes         64.0         No          0
26      Yes         69.0         Yes         21.0
27      Yes         70.0         Yes         22.0
28      No          0            No          0
29      Yes         52.0         Yes         27.0
30      No          0            Yes         37.0
31      Yes         79.0         No          0
32      Yes         81.0         No          0
33      Yes         83.0         Yes         26.0
34      Yes         74.0         Yes         31.0
35      Yes         74.0         No          0
36      Yes         78.0         No          0
37      Yes         63.0         Yes         27.0
38      Yes         69.0         Yes         41.0
39      Yes         58.0         No          0
40      Yes         70.0         No          0
41      Yes         68.0         No          0
42      Yes         62.0         Yes         25.0
43      Yes         63.0         No          0
44      Yes         68.0         Yes         19.0
45      Yes         58.0         No          0
46      Yes         81.0         No          0
47      Yes         68.0         Yes         37.0
48      Yes         73.0         Yes         22.0
49      Yes         67.0         No          0
50      No          0            Yes         25.0

Page 35

Trial 2 (height of seedling / mm, ±0.5 mm)

Seed    Smoke water              De-ionized water
number  Germinated  Height       Germinated  Height
1       Yes         72.0         No          0
2       Yes         73.0         Yes         26.0
3       No          0            Yes         21.0
4       Yes         72.0         Yes         23.0
5       Yes         57.0         No          0
6       Yes         74.0         No          0
7       Yes         79.0         No          0
8       Yes         62.0         No          0
9       Yes         78.0         Yes         31.0
10      Yes         64.0         No          0
11      Yes         72.0         Yes         14.0
12      Yes         79.0         No          0
13      Yes         72.0         No          0
14      Yes         57.0         Yes         16.0
15      Yes         56.0         Yes         18.0
16      Yes         83.0         No          0
17      Yes         63.0         No          0
18      No          0            No          0
19      Yes         72.0         Yes         26.0
20      Yes         63.0         Yes         31.0
21      No          0            Yes         25.0
22      Yes         58.0         No          0
23      Yes         81.0         No          0
24      Yes         57.0         Yes         21.0
25      Yes         62.0         No          0
26      No          0            Yes         31.0
27      Yes         74.0         Yes         26.0
28      Yes         73.0         No          0
29      Yes         83.0         Yes         23.0
30      Yes         58.0         Yes         36.0
31      Yes         74.0         No          0
32      Yes         57.0         No          0
33      Yes         63.0         Yes         14.0
34      Yes         79.0         Yes         23.0
35      Yes         60.0         No          0
36      Yes         74.0         No          0
37      Yes         79.0         Yes         23.0
38      Yes         57.0         Yes         27.0
39      Yes         86.0         No          0
40      Yes         53.0         No          0
41      Yes         56.0         No          0
42      Yes         67.0         Yes         24.0
43      Yes         63.0         No          0
44      Yes         68.0         Yes         45.0
45      Yes         54.0         No          0
46      Yes         68.0         No          0
47      Yes         68.0         Yes         42.0
48      No          0            Yes         23.0
49      Yes         62.0         No          0
50      Yes         72.0         No          0

Page 36

Page 37

DRAFT self-assessment

Grade (total marks): 1 (0–3), 2 (4–6), 3–4 (7–13)
Student mark points: Personal engagement 0 1 2; Exploration 0 1 2 3 4 5 6; Analysis 0 1 2 3 4 5 6

PERSONAL ENGAGEMENT
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1 mark:
• The evidence of personal engagement with the exploration is limited with little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
2 marks:
• The evidence of personal engagement with the exploration is clear with significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the investigation.

EXPLORATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The topic of the investigation is identified and a research question of some relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of limited relevance and does not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very limited extent since it takes into consideration few of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
3–4 marks:
• The topic of the investigation is identified and a relevant but not fully focused research question is described.
• The background information provided for the investigation is mainly appropriate and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has limitations since it takes into consideration only some of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
5–6 marks:
• The topic of the investigation is identified and a relevant and fully focused research question is clearly described.
• The background information provided for the investigation is entirely appropriate and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it takes into consideration all, or nearly all, of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.

ANALYSIS
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very incomplete.
3–4 marks:
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the research question can be deduced.
5–6 marks:
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the research question can be deduced.

Grade (total marks): 5 (14–16), 6 (17–19), 7 (20–24)
Student mark points: Evaluation 0 1 2 3 4 5 6; Communication 0 1 2 3 4

Page 38

EVALUATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• A conclusion is outlined which is not relevant to the research question or is not supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension of the investigation.
3–4 marks:
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are described and provide evidence of some awareness of the methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of the investigation.
5–6 marks:
• A detailed conclusion is described and justified which is entirely relevant to the research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are discussed and provide evidence of a clear understanding of the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the investigation.

COMMUNICATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
3–4 marks:
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not hamper understanding.

Page 39

DRAFT peer review

Grade (total marks): 1 (0–3), 2 (4–6), 3–4 (7–13)
Student mark points: Personal engagement 0 1 2; Exploration 0 1 2 3 4 5 6; Analysis 0 1 2 3 4 5 6

PERSONAL ENGAGEMENT
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1 mark:
• The evidence of personal engagement with the exploration is limited with little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
2 marks:
• The evidence of personal engagement with the exploration is clear with significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the investigation.

EXPLORATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The topic of the investigation is identified and a research question of some relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of limited relevance and does not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very limited extent since it takes into consideration few of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
3–4 marks:
• The topic of the investigation is identified and a relevant but not fully focused research question is described.
• The background information provided for the investigation is mainly appropriate and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has limitations since it takes into consideration only some of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
5–6 marks:
• The topic of the investigation is identified and a relevant and fully focused research question is clearly described.
• The background information provided for the investigation is entirely appropriate and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it takes into consideration all, or nearly all, of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.

ANALYSIS
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very incomplete.
3–4 marks:
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the research question can be deduced.
5–6 marks:
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the research question can be deduced.

Grade (total marks): 5 (14–16), 6 (17–19), 7 (20–24)
Student mark points: Evaluation 0 1 2 3 4 5 6; Communication 0 1 2 3 4

Page 40

EVALUATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• A conclusion is outlined which is not relevant to the research question or is not supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension of the investigation.
3–4 marks:
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are described and provide evidence of some awareness of the methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of the investigation.
5–6 marks:
• A detailed conclusion is described and justified which is entirely relevant to the research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are discussed and provide evidence of a clear understanding of the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the investigation.

COMMUNICATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
3–4 marks:
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not hamper understanding.

Page 41

DRAFT post-peer review self-assessment

Grade (total marks): 1 (0–3), 2 (4–6), 3–4 (7–13)
Student mark points: Personal engagement 0 1 2; Exploration 0 1 2 3 4 5 6; Analysis 0 1 2 3 4 5 6

PERSONAL ENGAGEMENT
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1 mark:
• The evidence of personal engagement with the exploration is limited with little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
2 marks:
• The evidence of personal engagement with the exploration is clear with significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the investigation.

EXPLORATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The topic of the investigation is identified and a research question of some relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of limited relevance and does not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very limited extent since it takes into consideration few of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
3–4 marks:
• The topic of the investigation is identified and a relevant but not fully focused research question is described.
• The background information provided for the investigation is mainly appropriate and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has limitations since it takes into consideration only some of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
5–6 marks:
• The topic of the investigation is identified and a relevant and fully focused research question is clearly described.
• The background information provided for the investigation is entirely appropriate and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it takes into consideration all, or nearly all, of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.

ANALYSIS
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very incomplete.
3–4 marks:
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the research question can be deduced.
5–6 marks:
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the research question can be deduced.

Grade (total marks): 5 (14–16), 6 (17–19), 7 (20–24)
Student mark points: Evaluation 0 1 2 3 4 5 6; Communication 0 1 2 3 4

Page 42

EVALUATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• A conclusion is outlined which is not relevant to the research question or is not supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension of the investigation.
3–4 marks:
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are described and provide evidence of some awareness of the methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of the investigation.
5–6 marks:
• A detailed conclusion is described and justified which is entirely relevant to the research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are discussed and provide evidence of a clear understanding of the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the investigation.

COMMUNICATION
0 marks:
• The student’s report does not reach a standard described by the descriptors below.
1–2 marks:
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
3–4 marks:
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not hamper understanding.

Page 43 Page 44 Page 45 Page 46 Page 47 Page 48 Page 49 Page 50 Page 51 Page 52 Page 53 Page 54 Page 55 Page 56 Page 57

Page 58

IB STATISTICS HANDBOOK
Mag Karl Schauer BSc
[Image: xkcd.com]

Why Statistics? 4
What should you know? 4
Types of Data 5
    Categorical data 5
    Ordinal Data 5
    Numerical data 5
    Frequency 5
    Other Data 5
Sampling Techniques 6
    Random sampling 6
    Systematic Sampling 6
    Stratified Sampling 6
Descriptive statistics 7
    Averages 7
        Mean 7
        Median 7
        Mode 7
    Measures of central Tendency 8
        Standard deviation 8
        Minimum, Maximum and Range 8
        Quartiles and Interquartile Ranges 8
    Histograms 9
    The normal curve 10
Data tables 11
    Title 11
    Labels 11
    Data 11
    Summary statistics 11
    Formatting 11
Graphical techniques 13
    Formatting and Labelling 13
    Bar graphs 13
    Line graphs 14
    Scatter plots and correlation 14
        Correlation 15
        Extrapolation and Interpolation 17
    Other graphs 19
Hypothesis Testing 20
    Testing for differences 21
        The T-test 21
        ANOVA 24
        The Mann-Whitney U test 24
    Testing for Correlation 25
        The Pearson correlation test 25
        Spearman Rank correlation test 26

Page 59

WHY STATISTICS?

An academic investigation is a way to try to answer a question. This question must be defined, and a method determined to collect appropriate data. Predictions are then made based on the knowledge gained by answering previous questions in previous investigations. So where do statistics come in? Statistics are the tool you need to boil down all of your carefully collected data into a clear answer. Importantly, they also tell you how sure you can be of that answer.

WHAT SHOULD YOU KNOW?

In order to complete your internally assessed work or data-based extended essays in the IB, you will need to apply some basic statistics. You will need to summarise and describe your data using descriptive statistics like averages and standard deviations. Then you will need to present your data in tables and graphs. Finally, depending on the investigation, you may need to perform a hypothesis test or other calculations to definitively answer your question.

You won’t generally be expected to do the sometimes complicated calculations by hand. Tools like spreadsheet software (Excel, LibreOffice etc.) or your TI-Nspire handheld make many calculations a trivial matter of entering numbers.
You can and will likely need to look up tutorials for using your software online, as there are many different softwares and platforms, but there are many tutorials readily available (see this list of resources for help). What is trickier, and mostly up to you, is deciding what statistics to apply in what circumstances, and understanding what those calculations tell you. This handbook should help you with those decisions and that understanding. Other statistical tests .................................................................................27 The chi-squared test ............................................................................27 Nearest neighbour analysis ................................................................27 Critical Value tables ..........................................................................30 How to use spreadsheet software ...................................................31 More Help and further reading ........................................................32 This handbook is intended to be used digitally, and contains some cross-referencing and external links. Underlined text, as well as the table of contents can be clicked to take you where you want to go. 3 4 Page 60 IB STATISTICS HANDBOOK IB STATISTICS HANDBOOK T YPES OF DATA SAMPLING TECHNIQUES You might encounter all types of data in your investigations. It is important to distinguish between a few different types of data because not all statistical techniques work with all types of data. Since you will never have enough time or resources to measure all of the possible data points in the population (and if you use statistics, you shouldn’t need to), you will always only measure a small portion of all of the possible points called a sample. But which data points should go into the sample? In order to have a fair test, it is important that each possible data point is equally likely to be chosen for the sample. That is to say, there should be no sampling bias. 
In order to do this, you will need to use a sampling strategy that fits your investigation. C ATEGORIC AL DATA This type of data fits into defined categories. For example: red, green and blue as options for people’s favourite colour are categories. RANDOM SAMPLING ORDINAL DATA Just to be clear, it is not sufficient to claim that a sample is random if you have simply chosen ‘at random’ places to measure. You might have a subconscious bias for certain measurements. To be truly random, you will need to assign a number to each possible data point, and use a random number generator to tell you which measurements to collect. This is similar to categorical data, but there is a clear order of the groups. For example: low, medium and high income categories. These categories don’t always or necessarily have the same distances between them. NUMERIC AL DATA This type of data includes measurements of all kinds. There is a clear order, as in ordinal data, but the distances between data points are clearly defined. Length, mass and speed are all numerical data. One way to do this is with the RAND() function in spreadsheet software. Simply enter ‘=RAND()’ in a cell, and it will show you a random number between 0 and 1. You can then multiply this number by whatever you need to in order to have a random number between 0 and that number. For example: if you wanted a random number between 0 and 100, you could simply type ‘=RAND()*100’ into a cell. FREQUENCY SYSTEMATIC SAMPLING When a statistician uses the word frequency, they generally mean a count of the number of things. ‘What is the frequency of…’ can usually be translated to ‘ how many…’ In this technique, you simply choose to sample at regular intervals. For example, you might choose to make a measurement every ten meters on a transect. OTHER DATA STRATIFIED SAMPLING This list is certainly not complete. There are other specialty types of data that you might encounter, but these should be sufficient for most of your investigations. 
This more complicated sampling method is only used if the population is made up of different sub-sets that make up different proportions of the whole. It might be important to make sure that one sub-set isn’t being over- or under-represented in the data. This is most commonly used with survey data.

DESCRIPTIVE STATISTICS
Once you have collected your data, you will need to boil it down. Descriptive statistics, sometimes called summary statistics, do just that: they help your reader see the general trends and patterns in your data.

AVERAGES

MEAN
This is generally what is meant when someone says ‘average’. It is the sum of the values divided by the number of values. This is the most common way to summarise sample data.

MEDIAN
This is the ‘middle’ data point. There are just as many data points higher and lower than this one in the sample. This measure of average is more likely to be used if the sample is distributed in a strange way, or if outliers might strongly affect the mean. For example, in a sample measuring personal wealth, one or two billionaires in the sample might heavily skew the mean to show an average income that is higher than that of almost every single individual. In this case it would be appropriate to use median income to better represent the sample.

MODE
Mode is much less commonly used. It is the ‘most common’ data point, or in other words, the data point with the highest frequency.

MEASURES OF SPREAD
Just knowing the average of a group only tells part of the story. Another important aspect is the spread in the data. How similar are the data points to each other? Are they all basically the same, or are there wild differences? These measurements all describe how spread out the data is around the average.

STANDARD DEVIATION
Standard deviation describes the typical distance between each data point and the mean. More precisely, it is the square root of the average squared difference from the mean. A large standard deviation, relative to the size of the measurements, means that the data is very spread out. This means that there are generally large differences between data points. A small standard deviation, relative to the size of the measurements, indicates that the measurements are close together, or that they ‘agree’.

Standard deviations are useful for numerical data. If the data is not numerical, or is not normal, you will need a different way to show the spread of the data.

MINIMUM, MAXIMUM AND RANGE
To give a basic idea of the spread of your data it is often good to include the total range of the data, that is, to point out the highest and lowest measured values (maximum and minimum), and the distance between them (range).

QUARTILES AND INTERQUARTILE RANGES
These measures of spread apply to the median in a similar way to how the standard deviation applies to the mean. Here is how they are calculated:
1. Arrange the data in rank order and divide it into four equal parts, each containing an equal number of values. Each part is called a quartile. The quartile containing the highest values is the upper quartile, while the one with the lowest values is the lower quartile.
2. The Upper Quartile Value (UQV) or Q3 is the mean of the lowest value in the upper quartile and the highest value in the quartile below it.
3. The Lower Quartile Value (LQV) or Q1 is the mean of the highest value in the lower quartile and the lowest value in the quartile above it.
4. The Inter-Quartile Range (IQR) is the difference between the values calculated in #2 and #3 (UQV minus LQV). A high IQR means the data is very dispersed, while a low IQR means the data is less dispersed.
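The measures described above can also be calculated outside a spreadsheet. The following is a minimal sketch in Python, assuming the sample 1, 3, 3, 5, 8, 11, 12 and using only the standard `statistics` module. Note that several conventions for quartiles exist: `statistics.quantiles` uses its own default method, so its quartile values can differ slightly from the hand method in the steps above.

```python
import statistics

# Illustrative seven-value sample.
sample = [1, 3, 3, 5, 8, 11, 12]

# Averages
mean = statistics.mean(sample)      # sum of the values / number of values
median = statistics.median(sample)  # middle value of the ranked data
mode = statistics.mode(sample)      # value with the highest frequency

# Measures of spread
sd = statistics.stdev(sample)              # sample standard deviation (n - 1 denominator)
data_range = max(sample) - min(sample)     # maximum minus minimum

# Quartile cut points using the library's default ("exclusive") convention.
# This is an assumption of the sketch, not the handbook's hand method.
q1, q2, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1                              # upper quartile value minus lower quartile value

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")
print(f"standard deviation = {sd:.1f}, range = {data_range}")
print(f"lower quartile = {q1}, upper quartile = {q3}, IQR = {iqr}")
```

Spreadsheet software offers equivalent built-in functions, so this sketch is only useful if you prefer to script your calculations.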
Example, given the seven values in the sample 1, 3, 3, 5, 8, 11, 12:

Mean (average):        43/7 = 6.1
Median (middle):       5
Mode (most common):    3

For example:
[Figure: the values 1 to 11 in rank order, with Q1, Q2 and Q3 marked.]
IQR = UQV - LQV = 8.5 - 3.5 = 5

For a more detailed explanation, visit this site: https://stattrek.com/statistics/dictionary.aspx?definition=interquartile%20range

HISTOGRAMS
A histogram is a way to visualise data in a sample. It is essentially a bar chart with categories for the measured values (for example 1-10, 11-20, 21-30) on the x-axis and frequency (the number of data points in that category) on the y-axis. A histogram can be useful to show if your data are normally distributed, that is, they generally look like a bell curve (see the section on the normal curve). It may be important to show this before you can use some types of hypothesis tests.

[Figure: histogram with Age on the x-axis and Frequency on the y-axis.] This histogram shows the distribution of age in a sample. It appears to be slightly right-skewed.

Your histogram might not show a normal curve. The data might be skewed to one direction, or even show several peaks. These might be important aspects to bring up in your evaluation, and might help you choose what hypothesis test, if any, you can use.

[Figure: three histogram shapes: Normal, Right-skewed, Left-skewed.] These are some shapes of histograms you might encounter.

Here is a deeper explanation of histograms and how they are made by hand: https://youtu.be/4eLJGG2Ad30
Here is a longer explanation of the different shapes you might encounter in histograms: https://youtu.be/Y53_8WRrPzg

THE NORMAL CURVE
Many data just happen to fit a normal distribution curve, also called a bell curve or, more technically, a Gaussian curve. If your data fits this type of distribution, you can make some predictions using your data and the mathematics behind this curve.

[Figure: the normal curve, with shaded areas under the curve and standard deviations on the x-axis.] The standard, ‘bell’, or Gaussian curve shows the pattern of how normally distributed data spreads around the mean. If your histogram looks like this, your data is probably normally distributed. The shaded areas under the curve represent the proportion of data points that will likely be found in this section of the curve. The x-axis shows standard deviation distances with the mean at 0.

The area under the normal curve shows how many data points are likely to be found in any given range. About 68% of data points will, on average, be within one standard deviation of the mean, and about 95% will fall within 2 standard deviations. This is helpful in predicting probabilities, and this type of math is the basis for hypothesis tests.

The normal distribution curve is one of many used in statistics, but it is the most common shape you will likely encounter.

If your data appear to be normally distributed, that is, your histogram appears to have a normal curve shape, then you may be able to use some hypothesis tests that require normal data as a prerequisite.

DATA TABLES
Once you have boiled your data down into some tangible values, you will need to present the raw data and your descriptive statistics in well-organised tables. Designing data tables is an art form all its own. A few points might help you make yours beautiful.

TITLE
Be sure that each table has a meaningful and descriptive title (not just ‘table 1’). With multiple tables, it is usually a good idea to number them (hint: check that your numbers are right before you hand in a draft!) so that you can refer to them easily in your text.

LABELS
Your data columns need proper labels including:
• A clear descriptive title of what data is listed in the column,
• The appropriate units of those values, and
• The measurement precision of those values.

DATA
The data itself should have the correct number of significant figures to reflect the precision of the data (see these links for help with significant figures). Be careful not to show more precision (more significant figures) in an average than in the measurements it is based on. You will likely need to format the cells of your table to show the appropriate number of digits, since the trailing zeros disappear otherwise. If you have very large or very small values, simply use scientific notation.

SUMMARY STATISTICS
You may want to include your averages and standard deviations right in the table with your raw data. If you have a lot of data, or it is relatively complex, you might want to create a separate data table of your summary statistics. You should use whatever you think will help your reader see the data best.

FORMATTING
It is usually a good idea, if possible, to present your data table on one page. Having the first half of a table at the end of one page, with the last half continuing on the next, makes it very hard to get an overview of the data. Also, try to size your columns carefully to fit them on the page, but not to muddle the titles.

Here is an example of a well-organised table:

Table 1: The height of 15 Z. mays plants after growing for 30 days at different fertiliser concentrations in three different field sites.

Concentration of      Plant height 30 days after germination (+/- 1mm)
fertiliser in soil    Field      Field      Field                  Standard
(+/- 0.10 mg/kg)      site 1     site 2     site 3     Average     deviation
0.10                  345        330        404        360         39
0.20                  442        410        430        427         16
0.30                  510        470        550        510         40
0.40                  580        530        603        571         37
0.50                  200        130        240        190         56

GRAPHICAL TECHNIQUES
Once you have presented your data in tables, you will need to make it more readily visible to your reader. It is important to choose the right graph for the type of data you are presenting. The formatting and labelling of the graph is also important. If done well, graphs should show the reader the answer to your research question at a glance.
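Before graphing, raw values are usually reduced to the processed values that will be plotted: one average and one standard deviation (for the error bars) per group. As a minimal sketch, assuming the raw plant heights of Table 1 and a hypothetical helper named `summarise`, this could be done in Python as follows. Results are rounded to whole millimetres to match the +/- 1 mm precision of the measurements.

```python
import statistics

# Raw plant heights in mm from Table 1, keyed by fertiliser concentration (mg/kg).
# Each list holds the three field-site measurements for that concentration.
heights_mm = {
    0.10: [345, 330, 404],
    0.20: [442, 410, 430],
    0.30: [510, 470, 550],
    0.40: [580, 530, 603],
    0.50: [200, 130, 240],
}


def summarise(raw):
    """Return {group: (mean, sample standard deviation)}, rounded to whole mm.

    `summarise` is a hypothetical helper written for this sketch.
    """
    return {
        group: (round(statistics.mean(values)), round(statistics.stdev(values)))
        for group, values in raw.items()
    }


summary = summarise(heights_mm)
for concentration, (mean, sd) in summary.items():
    # The mean gives the height of the bar or point; the SD gives the error bar.
    print(f"{concentration:.2f} mg/kg: mean = {mean} mm, error bar = +/- {sd} mm")
```

Calculating the standard deviation separately for each group, as here, is what gives each bar its own error bar value.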
FORMATTING AND LABELLING
Generally speaking, the same rules for tables apply to graphs. Be sure that each graph has a clear and descriptive title, and that the axes are labelled in the same way that the columns of the corresponding tables are labelled. Make sure that the axes are scaled so that the data fill the graph, and that the scale numbers reflect the same level of precision as the data. Also, make sure that the independent variable (that you changed or defined on purpose) is on the x-axis, and that the y-axis shows your dependent variable (measured result). It is almost always best to graph your processed data, and not the raw data, unless there is something important you want to show the reader about your raw data.

BAR GRAPHS
Bar graphs are used to represent numerical data (y-axis) from different categories (x-axis). Bar graphs of averages should have error bars showing standard deviation, or some other measure of spread. Somewhere on the graph or in its caption you need to declare what the error bars represent. Be sure that you are using the unique standard deviation values that you calculated in your tables, and not the automatic values that some software applies (incorrectly).

[Figure 1: The average growth of cress seeds after growing for 4 days under different coloured light. The error bars represent one standard deviation. x-axis: Color of light applied to growing plants (Red, Blue, Green, Yellow, Orange); y-axis: Average growth of cress plants after 4 days (+/-1mm).]

LINE GRAPHS
Line graphs and scatter plots are often confused for each other. Line graphs show straight lines connecting the dots of the data points. This is to represent the fact that line graphs show multiple measurements of the same thing. The straight line is an assumed linear change of that measured value between measurements. Therefore, only use a line graph if you are tracking the change of something.

[Figure 2: Global average temperature from 1880 - 2000. x-axis: Year; y-axis: Average global temperature (+/-0.01°C).]

SCATTER PLOTS AND CORRELATION
Scatter plots are used to compare numerical values on both axes. If both your independent and dependent variables are numerical measurements, this is probably the type of graph you should use. Each dot on the graph represents a data point, and this can show trends in the data. Usually this type of investigation is looking for some sort of relationship between the two variables. You will need to start with this type of graph to look for correlations, or in order to perform interpolations or extrapolations.

If you graph average values, you will need error bars to show the spread of the data (see bar graphs). Be sure, however, to use all data points, not just averages, to calculate an R² value.

[Figure 1: This graph shows a strong positive correlation between the height and age of the sampled trees. x-axis: Age of tree (+/-1year); y-axis: Height of tree (+/-2m). Trend line: y = 0.7334x - 2.8792, R² = 0.9189.]

CORRELATION
A correlation is a relationship between two numerical variables. A correlation can be positive or negative:
• Positive correlation: As the independent variable increases, the dependent variable also increases.
• Negative correlation: As the independent variable increases, the dependent variable decreases.

[Figure: three scatter plots labelled Positive correlation, Negative correlation and No correlation.]

The line of best fit or trend line is chosen by the computer to be as close as possible to all of the data points. It is an approximation of the linear trend in the data. The closer all of the data points are to the trend line, the stronger the correlation.

The degree of correlation is measured by the correlation coefficient, r or R, more technically called the ‘Pearson product-moment correlation coefficient’. This value ranges from -1, for a set of data that aligns perfectly to a line with a negative slope, to 1, for a set of data points that align perfectly to a line with a positive slope. The closer the r value is to either -1 or 1, the stronger the relationship between the two variables.

[Figure: example scatter plots for r = 1, 0.8, 0.3, 0, -0.3, -0.8 and -1.]

Alternatively, correlation can be reported as the coefficient of determination, r² or R². This is simply the correlation coefficient squared, which therefore always has a value between 0 and 1. This value is defined as the proportion of the variance in the dependent variable that can be predicted by the independent variable. For example, with an r² value of 1, all of the data points align perfectly. That means that the value of the dependent variable can be predicted exactly for any value of the independent variable. For an r² value of 0.8, 80% of the variance in the dependent variable is accounted for by the independent variable; the remaining 20% is scatter around the trend line that the line cannot predict. Predicting the values of variables based on a correlation is called extrapolation or interpolation.

In order to make claims about a linear correlation, it is important that the data show a linear trend. If the data are not linear, or not expected to be linear, it is not appropriate to compare them to a trend line! Here are some examples of when linear regression is not appropriate:

• Enzyme activity is expected to increase as temperature increases, then peak at the optimum temperature for that enzyme, then drop sharply as the enzyme denatures at higher temperatures. A graph comparing temperature and enzyme activity might therefore look something like this:
[Figure: Enzyme activity against Temperature (+/-1°C), rising to a peak and then falling sharply.] This shape of a graph should not be compared to a line.

• The rate of a chemical reaction slows over time as substrate is used up. The shape of the curve produced is predictable and depends on the type of reaction. The graph of the concentration of the product over time might look something like this:
[Figure: Concentration of product (+/- 0.1M) against Time (+/-1s), rising and levelling off.] This shape of a graph should not be compared to a line. Instead, a Spearman Rank test can be performed.

When interpreting r or r² values, be sure to be realistic about the strength of the correlation. An r² value of 0.3 may or may not indicate any kind of relevant relationship. If you want more certainty about whether your correlation is statistically significant, you should consider using a Pearson’s R test for correlation or a Spearman’s Rank test. You can read more about these tests in the section on hypothesis testing.

EXTRAPOLATION AND INTERPOLATION
If you have determined a strong linear relationship in your data, you can use this data to make predictions. You can use the equation of the line of best fit to calculate the expected value for an unknown.

Interpolation is using the trend line to predict values within the range of your data. Extrapolation is extending the trend line beyond the data to make predictions outside of the range of data. The further the predicted value is from the measured values, the less reliable the extrapolated value will be.

An example of extrapolation is using current trends in climate change to make predictions about how the planet’s climate will continue to change in the future. This is how climatologists make predictions about how warm the earth might be in the coming decades.

An example of interpolation is determining the osmolarity of a tissue. Suppose you measured the rate of osmosis in potato tissue in various concentrations of sugar solution. Your data were linear and looked like this:

[Figure: Change in mass of potato tissue after 1h (+/-0.01g) against Concentration of sucrose solution (+/-0.01M). Trend line: y = -7.1143x + 2.073, R² = 0.9659. The trend line crosses the x-axis at 0.29.]

You must use all of the data points (not just the averages!) in order for your software to accurately calculate R².

The high R² value of 0.97 shows that the data have a strong linear correlation. The negative slope value of -7.1143 shows that the relationship is a negative correlation. Because the trend is very strong, the equation for the line can be used to make predictions. You were asked to determine what concentration would be isotonic to the potato tissue, that is, at what concentration no net osmosis would occur. At this concentration the change in the mass of the potato tissue would be zero. To find the corresponding concentration, you can substitute zero for y (the change in mass) in the trend line equation, and solve for x (the concentration):

y = -7.1143x + 2.073
0 = -7.1143x + 2.073
-2.073 = -7.1143x
x = -2.073 / -7.1143
x = 0.2914

This value of 0.29 M is where the trend line crosses the x-axis, and it is the osmolarity, or isotonic concentration, of the potato tissue. It must be rounded to reflect the precision of the measurements that were used to calculate it. This process could be repeated to predict any given change in mass or any concentration within the range of the data.

OTHER GRAPHS
Though the graphs listed above are the most likely you will need, there are of course many other types of graphs. Here are two others that you might consider:

Pie charts show the breakdown of a group into its parts, usually percentages. The percentages should add up to 100. Avoid too many categories, as the chart can quickly become difficult to read.

Radar charts can be used to show many different attributes at once, and compare these between locations or individuals.

HYPOTHESIS TESTING
The goal of an experiment or investigation is to answer a specific question.
The data should make it clear what the answer to that question is. Often, due to the uncertainty inherent in data, the answer may not be entirely clear. It may look like there is a difference between two groups, but the difference might only be due to chance. It may appear that there is a correlation between two variables, but the sample may have been a fluke. Hypothesis testing allows you to determine how sure you can be of the answer, by estimating how likely it is that a pattern like the one you observed would arise by chance alone.

A hypothesis test requires that you make an assumption, and calculate how probable your data would be if this assumption were true. This assumption is called the null hypothesis, H0. If your data can be shown to be very unlikely under the null hypothesis, then you can reject it in favour of the alternative hypothesis, HA.

Despite the naming, these hypotheses are different from your experimental hypothesis, that is, your reasoning about what you think will happen in your experiment. You always need to declare and explain an experimental hypothesis in the exploration portion of your work. You only need to declare null and alternative hypotheses in the context of your hypothesis test, if you choose to use one. This should be included in your explanation of the data analysis.

A hypothesis test generates a test statistic. The value of this test statistic tells you how far your data are from what the null hypothesis predicts. That test statistic can then be compared to a table of critical values that it must be higher or lower than in order to conclude a statistically significant result. Usually this process is simplified, and a p-value can be calculated based on the test statistic. The p of p-value stands for probability: it is the probability of obtaining a result at least as extreme as yours if the null hypothesis were true. It is always a value between 0 and 1 (i.e. 0 and 100%). If the p-value is low enough, then your data would be very unlikely under the null hypothesis, and it can reasonably be rejected.
When the null hypothesis is rejected, the alternative hypothesis can be concluded, and there is a statistically significant result. The p-value is compared to the alpha value. For our intents and purposes you will use an alpha value of 0.05. This is the threshold probability below which you decide that your data are too unlikely under the null hypothesis. That is, if the p-value is less than 0.05 (5%), then you should reject the null hypothesis. For an example of this process, read the section on t-testing.

If the p-value is above the alpha threshold of 0.05, then you must ‘fail to reject the null hypothesis’. This is different from accepting the null hypothesis! You don’t have enough evidence to conclude that the null hypothesis is true. Instead you simply ‘fail to reject’ and conclude that you cannot be sure whether the observed result is due to random chance or a real effect. For example, if a test gives a p-value of 0.2, a result at least as extreme as yours would occur about 20% of the time even if the null hypothesis were true. That is too common to rule out chance, but it is not evidence that the null hypothesis is true either. Therefore you simply ‘fail to reject the null hypothesis’.

TESTING FOR DIFFERENCES
Often an investigation aims to find differences between groups. The t-test, ANOVA, and the Mann-Whitney U test are different ways to determine whether observed differences are statistically significant, or just due to random chance.

The t-test is used when the data can be assumed to be normal and the sample sizes are relatively large (more than 10 measurements). It might be a good idea to make a histogram to see if the data appear to be normal, but at the very least you should state that you assume the data to be normally distributed, and why you think it is.

If the assumptions for normality are met for the t-test, but you have more than two groups, you will need to perform an ANOVA (analysis of variance) test to see if the variability between the groups is due to chance or some real effect.

If it is not safe to assume that the data are normally distributed, you have small samples, or your data are ordinal, but not numerical, then you can make a comparison between two groups using the Mann-Whitney U test instead. This test is less likely to find a difference if there is one, but it is safer to use if the prerequisites for a t-test are unclear or not met.

THE T-TEST
The t-test assumes a null hypothesis that there is no real difference between the groups (any observed difference is due to chance), then calculates how probable a difference at least as large as the observed one would be under that hypothesis.

H0: There is no significant difference between the two groups.
HA: The observed difference between the groups is statistically significant, and not likely due to chance.

Suppose you want to find out if dandelions (T. officinale) grow to different heights in two different types of soil. In your experiment you measure the growth of the dandelions in each of two soil types.

Maximum achieved height of T. officinale plants (+/-0.1cm)

                      Soil a    Soil b    Soil a    Soil b
                      8.0       15.5      6.3       14.7
                      9.1       14.5      13.2      12.2
                      12.0      10.1      6.3       15.0
                      10.0      12.1      11.0      13.2
                      12.1      16.0      9.8       14.2
                      8.5       13.1      12.2      9.9
                      9.7       17.8      10.1      10.3
                      13.2      16.4      10.3      19.0

                      Soil a    Soil b
Average               10.1      14.0
Standard deviation    2.1       2.7

[Figure 1: The average maximum height of 16 T. officinale plants grown in two different soil types. Error bars represent one standard deviation. x-axis: Soil a, Soil b; y-axis: Average maximum height of plants (+/-0.1cm).]

You notice a difference between the groups. The plants in soil b have grown taller on average than the plants in soil a. Since the error bars are overlapping, it is hard to say whether this observed difference is due to chance, or whether the two are really different.
Therefore you decide to perform a t-test to find out. First, you need to determine whether the data are normally distributed, so you make a histogram to see.

Figure 2: A histogram of the data shows that it appears to be normally distributed. [Histogram of Soil a and Soil b: frequency (0 to 5) against height of plants (+/-0.1cm), in classes from 6.0-7.9 to 18.0-19.9.]

Because the data appear to be normally distributed, and the sample sizes are sufficiently large (n=16), you can proceed with the t-test.

Using the T.TEST() function of your spreadsheet software, you enter the following information:

=T.TEST(dataset from soil a, dataset from soil b, 2, 2)

• The two datasets are the lists of raw values, not the averages.
• The two 2's in the syntax tell the software what kind of t-test to perform. You want a 'two-tailed', 'non-paired' test.
• One annotation on the formula notes: this decimal may or may not be required in your software.

The T.TEST() function of your spreadsheet software returns the p-value for the test. The p-value is the probability of seeing a difference at least as large as yours if H0 were true.

You can watch a tutorial video on the t-test in Excel here: https://youtu.be/DPNUpldVC4M

As the p-value decreases, it becomes harder to explain the difference by random chance alone. Eventually, the p-value is so small that it is no longer reasonable to assume that the null hypothesis is true, and therefore the null hypothesis can be rejected. The most commonly used threshold level (also called the alpha value) for rejecting the null hypothesis is 0.05. That means that if the p-value sinks below 0.05 (5%), then the null hypothesis can be rejected and the alternative hypothesis accepted.

In the case of these data the p-value is 0.00008. This is well below the threshold level of 0.05, and therefore the null hypothesis can be rejected. The observed difference between the two soils is very unlikely to be due to random chance; it is a statistically significant difference.

ANOVA

The ANOVA test should be used if you have more than two groups to compare. Though you could theoretically perform many t-tests between each of the possible combinations, this is inefficient, and mathematically risky. Every time you perform a t-test, there is a small probability that your difference was in fact due to a random fluke, and not a real difference. If you perform many t-tests, the likelihood of making such an error increases.

The ANOVA test assumes the following hypotheses:

H0: The groups are all the same.
HA: At least one of the groups appears to be different than the rest. The variability between groups is not likely to be due to chance.

The ANOVA test produces a p-value that can be interpreted in the same way as in the t-test. If the p-value is below the threshold of 0.05, then the null hypothesis can be rejected and the alternative hypothesis concluded. The ANOVA test does not tell you which groups differ from each other, only that the variability between the groups is unlikely to be due to chance.

You can learn how to perform the ANOVA test in Excel or LibreOffice here:
Excel: https://youtu.be/qQSQr_JldyY
LibreOffice: https://youtu.be/TxTKq4W8qX8

THE MANN-WHITNEY U TEST

The Mann-Whitney U test is mathematically very different than the t-test, but achieves a similar goal of finding out whether an observed difference is statistically significant. Instead of using the actual values of the measurements for the comparison, it simply compares the rank order of the values. This is somewhat analogous to the difference between a mean and a median, with the t-test being similar to the mean and the Mann-Whitney U similar to the median. This type of hypothesis test that does not rely on the actual value of the measurements is called a non-parametric test. The null and alternative hypotheses are the same as for the t-test:

H0: There is no significant difference between the two groups.
HA: The observed difference between the groups is statistically significant, and not likely due to chance.

Although you can painstakingly calculate the Mann-Whitney U statistic by hand, with some help from your spreadsheet software, this is not a requirement of the IB. Instead simply enter your two sets of data in this online calculator:

https://www.socscistatistics.com/tests/mannwhitney/default2.aspx

The calculator gives you the p-value for the test, which you compare to the alpha threshold of 0.05 as in the t-test. If the p-value is below 0.05, you can reject the null hypothesis and conclude that there is a statistically significant difference.

TESTING FOR CORRELATION

If your investigation intends to look for a relationship between two variables, it might be a good idea to test whether the correlation your data suggest is statistically significant or likely to be due to chance. A correlation test does just that. These tests work in a similar way to tests for differences. The two most relevant tests are the Pearson product-moment correlation test, also called the Pearson correlation test, and the Spearman rank correlation test.

THE PEARSON CORRELATION TEST

The Pearson correlation test, similar to the t-test, requires the data to meet some prerequisites. In order to run this test, the following conditions must be true:

• The data are numerical for both variables.
• The data are paired, that is, there are two measurements or values for each data point, the dependent and independent variables.
• The data follow a linear trend.
• The data can be assumed to be normally distributed for both variables.
• There are no obvious outliers in the data set.

If these conditions are met, you can continue with the test. The null and alternative hypotheses are as follows:

H0: There is no correlation in the data. The observed trend is due to random chance.
HA: The observed trend is a statistically significant correlation.

In your spreadsheet software, enter the formula to calculate r, the Pearson correlation coefficient:

=PEARSON(dataset of independent variable, dataset of dependent variable)

This formula calculates the r value, which then needs to be compared to a critical value table (see the critical value tables).

The critical value table shows what values of r are significant. The strength of the test depends on the number of data points included (n), so the critical value also changes with n. Keep in mind that one data point has two measurements. If you were comparing, for example, height and weight of plants, and measured height and weight of 12 plants, then n would be 12, not 24.

You simply need to compare the r value that you calculated to the critical value corresponding to the number of data points you used. If the absolute value of r is greater than the critical value, then the correlation is statistically significant and not likely to be due to chance.

For example, if your r value was -0.65 and you had 10 data points, you would compare the absolute value of r, 0.65, to the critical value 0.521 from the critical value table. Since 0.65 is greater than 0.521, the correlation is statistically significant.

SPEARMAN RANK CORRELATION TEST

The Spearman rank correlation is denoted by the symbol ρ (rho) or rs. Analogous to how the Mann-Whitney U test compares rank instead of the actual values of the data, the Spearman rank test determines a correlation in the data by looking at the rank order of the data instead of its actual values. Use this test instead of the Pearson correlation test if any of the following are true:

• The data are not numerical but are ordinal.
• The data are not normally distributed.
• The data do not appear to have a linear trend, but do trend either positively or negatively.
• There are apparent outliers in the data.

The null and alternative hypotheses are the same as for the Pearson correlation test:

H0: There is no correlation in the data. The observed trend is due to random chance.
HA: The observed trend is a statistically significant correlation.

To calculate the test statistic, simply enter your data in the calculator at this site, and interpret the p-value as in the other tests:

https://www.socscistatistics.com/tests/spearman/default2.aspx

OTHER STATISTICAL TESTS

Though making comparisons between groups and determining correlations between variables are the most common statistical tests, there are many other ways to test data for significance. The chi-squared test for goodness of fit and the nearest neighbour analysis are two that you might need for biology and geography respectively.

THE CHI-SQUARED TEST

This test determines whether data fit a pattern or model. The data for the test need to be categorical. One of the simplest versions of this test is to test for an association between species, that is, whether the location of some species of immobile organism associates with the location of another species. A chi-squared test can also be used in genetics to determine if the frequency of genotypes matches the expected ratios.

For more information about this type of test and how to calculate and interpret the chi-squared value, refer to your Oxford biology textbook on pages 215 (association between species) and 453 (genotype ratios).

Here is a demonstration of how you can use Excel to calculate chi-squared: https://youtu.be/o0VhMWeotFg

NEAREST NEIGHBOUR ANALYSIS

In geography, the nearest neighbour analysis can be used to determine if the spacing between points is random, clustered, or ordered. First, the data is collected by measuring the distance between each location (e.g. tree) and its nearest neighbour. The nearest neighbour index (NNI or Rn) is then calculated according to the following formula:

$\mathrm{NNI} = 2\bar{D}\sqrt{\dfrac{n}{A}}$

where $\bar{D}$ is the average nearest neighbour distance, $n$ is the number of observations, and $A$ is the total area studied.

The NNI value can indicate that the points are clustered, random, or ordered, depending on its value:

NNI = 0 : The points are completely clustered.
NNI = 1.0 : The points have a completely random distribution.
NNI = 2.15 : The points are distributed uniformly.

[Scale from 0 (clustered) through 1.0 (random) to 2.15 (uniform).]

Given the number of data points, you can compare the NNI value to the critical value table. If NNI is below the number for clustered points, then there is statistically significant clustering. If it is above the value for uniformity, then the points are statistically significantly uniform. If it lies between the two values, the points are randomly distributed.

For example, the following data was collected by measuring the distances between trees in a 36 m² park:

Tree   Nearest   Distance (m)
1      2         1.1
2      1         1.1
3      2         1.3
4      7         0.4
5      3         1.2
6      7         1.0
7      4         0.4
8      9         2.0
9      8         2.0

[Map of the nine numbered tree positions, annotated with D̄ = 1.2, n = 9 and A = 36 m².]

Since $n = 9$, $A = 36\ \mathrm{m}^2$, and $\bar{D} = 1.17\ \mathrm{m}$ (about 1.2 m), the NNI value is calculated as:

$\mathrm{NNI} = 2\bar{D}\sqrt{\dfrac{n}{A}} = 2 \cdot 1.2 \cdot \sqrt{\dfrac{9}{36}} = 1.2$

For n = 9, the critical value table gives 0.713 as the limit below which the points would be considered clustered, and 1.287 as the upper limit, above which the data would be considered ordered. You can therefore conclude that the trees are randomly dispersed.
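The worked example can be reproduced in a few lines. This is a minimal sketch using only the nine distances, the area and the two critical values quoted in the example; the function names `nearest_neighbour_index` and `classify` are illustrative and not part of the handbook.

```python
from math import sqrt
from statistics import mean

# Nearest-neighbour distances (m) for the nine trees in the worked example.
distances = [1.1, 1.1, 1.3, 0.4, 1.2, 1.0, 0.4, 2.0, 2.0]
area = 36.0  # total area studied, in square metres


def nearest_neighbour_index(distances, area):
    """NNI = 2 * mean nearest-neighbour distance * sqrt(n / A)."""
    n = len(distances)
    return 2 * mean(distances) * sqrt(n / area)


def classify(nni, clustered_limit, uniform_limit):
    """Compare the NNI with the two critical values for this n."""
    if nni < clustered_limit:
        return "clustered"
    if nni > uniform_limit:
        return "uniform"
    return "random"


nni = nearest_neighbour_index(distances, area)
# 0.713 and 1.287 are the critical values quoted above for n = 9.
pattern = classify(nni, clustered_limit=0.713, uniform_limit=1.287)
print(f"NNI = {nni:.2f} -> {pattern}")
```

Working from the unrounded mean distance gives an NNI of about 1.17, which rounds to the 1.2 of the worked example and leads to the same conclusion.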
CRITICAL VALUE TABLES

Pearson r

n     Critical value
2     0.988
3     0.900
4     0.805
5     0.729
6     0.669
7     0.622
8     0.582
9     0.549
10    0.521
11    0.497
12    0.476
13    0.458
14    0.441
15    0.426
16    0.412
17    0.400
18    0.389
19    0.378
20    0.369
21    0.360
26    0.323
31    0.296
36    0.275
41    0.257
46    0.243
51    0.231
61    0.211
71    0.195
81    0.183
91    0.173
101   0.164

Nearest neighbour index

       Critical values
n      clustered   uniform
2      0.392       1.608
3      0.504       1.497
4      0.570       1.430
5      0.616       1.385
6      0.649       1.351
7      0.675       1.325
8      0.696       1.304
9      0.713       1.287
10     0.728       1.272
11     0.741       1.259
12     0.752       1.248
13     0.762       1.239
14     0.770       1.230
15     0.778       1.222
16     0.785       1.215
17     0.792       1.209
18     0.797       1.203
19     0.803       1.197
20     0.808       1.192
21     0.812       1.188
22     0.817       1.183
23     0.821       1.179
24     0.825       1.176
25     0.828       1.172
26     0.831       1.169
27     0.835       1.166
28     0.838       1.163
29     0.840       1.160
30     0.843       1.157
31     0.846       1.155
32     0.848       1.152
33     0.850       1.150
34     0.853       1.148
35     0.855       1.145
36     0.857       1.143
37     0.859       1.141
38     0.861       1.140
39     0.862       1.138
40     0.864       1.136
41     0.866       1.134
42     0.867       1.133
43     0.869       1.131
44     0.870       1.130
45     0.872       1.128
50     0.878       1.122
60     0.889       1.111
70     0.897       1.103
80     0.904       1.096
90     0.909       1.091
100    0.914       1.086

HOW TO USE SPREADSHEET SOFTWARE

You will need to spend some time learning how to use your brand of software on your platform, as they all differ somewhat. Excel© (subscription based) and LibreOffice© (freeware) are both good options, but you could also use Numbers© on MacOS, or Google© Sheets, though the latter has some significant limitations. Many of these calculations can also be performed on a TI-Nspire© handheld. Searching the web, or using your software's help function, will usually yield quick answers to tricky problems. Here are some tips and resources that might help you on your way:

Tip: Be sure you know whether your software expects a decimal ( . ) or a comma ( , ) as a separator. If you use the wrong one, the computer does not recognise your data as numbers, but instead treats it as text, which causes all calculations to fail.
Use the 'search and replace' function of your software to change all of them at once.

Tip: Use this site to find the appropriate function in your software's language: http://www.excelfunctions.eu

The Moodle site 7AB Tabellenkalkulation has guides for performing simple calculations and making diagrams in Excel© and LibreOffice© here: https://moodle.tsn.at/course/view.php?id=36089

At Mr. Schauer's youtube channel you can find a handful of videos on data analysis: bit.ly/mrschauersyoutube

Here are some instructions on how to make a histogram in Excel: https://support.office.com/en-us/article/create-a-histogram-85680173-064b-4024-b39d-80f17ff2f4e8

For more information on calculating quartiles and the inter-quartile range using Excel, visit this site: https://www.statisticshowto.com/probability-and-statistics/interquartile-range/#IQRExcel

MORE HELP AND FURTHER READING

For help reviewing how to use significant figures appropriately, watch these videos:
An introduction to significant figures: https://youtu.be/eCJ76hz7jPM
Rules to determine significant figures: https://youtu.be/eMl2z3ezlrQ

For lots of in-depth information on the geography Internal Assessment, visit these pages:
https://www.thinkib.net/geography/page/22606/ia-student-guide
https://sites.google.com/site/geographyfais/fieldwork

For more help with biological statistics for the IA, visit this site: https://www.biologyforlife.com/statistics.html

For a very useful handbook of basic statistics, look for a copy of this book: Methods of Statistical Analysis of Fieldwork Data. St. John, P. and Richardson, D.A. Geographical Association 1996.

Using statistics in IB biology and ESS 2014

Contents
Types of questions and types of data
Why is this important?
Types of questions
Types of data
The normal distribution
Presenting data
Which statistical test?
How to process your data in Excel
Descriptive statistics (Mean, Standard deviation, 95% confidence interval)
Mean
Standard deviation
95 % confidence interval
Null and alternative hypotheses
How significant is the difference in my data?
Testing for differences between 2 sets of data
T tests
One or two tailed?
Paired or unpaired?
Homoscedastic or heteroscedastic?
Testing for differences between more than 2 sets of data
Enabling the Data Analysis Package
Carrying out an ANOVA analysis
Looking for CORRELATIONS between variables.
What does it tell us if data are correlated
Which test?
Pearson test for correlation.
Spearman's Rank test for correlation
Testing to see if your observed data matches the expected data
What type of data?
Calculating Chi Squared by hand
Calculating Chi-squared using Excel

Types of questions and types of data

Why is this important?

It is important, in the PLANNING stages of an investigation, to think about how you will process the data that you collect. A common mistake is to collect data and then try to decide what you will do with it. This often results in people realising that they do not have enough data, or that the data is the wrong type for the statistical test they would like to perform. Worse still, it sometimes means that people try to do all of the tests they know and then pick the one that was significant, not for a good scientific reason, but simply because it matches their preconceptions. This is VERY bad science!

Types of questions

In biology you are usually (but not always) interested in one of 3 types of question:

• Is there a significant difference between my sets of data (e.g. Are girls taller than boys? Who are more intelligent: brown, blue or green eyed people?)?
• Is there a relationship between my 2 sets of measurements (e.g. Do taller people have bigger feet? Does increasing light intensity cause an increase in photosynthesis?)?
• Do the observed frequencies I have counted match the expected theory (e.g. The probability of rolling a 6 on a dice is 1/6. My sister has rolled 13 sixes in her last 36 rolls. Is the dice 'dodgy'?)?

The type of question you are asking has an effect on the type of data you need to collect. This in turn affects the statistics you need to use and the type of graph you will draw.

Types of data

Data can be continuous (i.e. it is measured on a scale where any value on that scale is possible), discrete (i.e. there are particular values that are possible and other values that are not possible) or ranks (i.e. you have data in an order 1st, 2nd, 10th or you have rating scales). You need to display these types of data differently and do different kinds of descriptive statistics. You also need to use different statistical tests. It is possible to convert between types (e.g. grouping continuous height data into categories 'short' and 'tall' or ranking people from tallest to shortest) if you need to process the data in a particular way.

The independent variable is the one you have deliberately changed to see what effect this has on the dependent variable (the 'results' you have measured to see if there has been a change). Control variables are other variables that should be controlled to ensure a fair test. If it is impossible to control them (e.g. in field work) then they should be measured so that their effect on the data can be evaluated.

The normal distribution

The normal distribution is frequently seen in biology. There is natural variation in many characteristics as a result of genes interacting with the environment. A normally distributed characteristic is one that is symmetrically distributed around the mean, with most values being close to the mean and very few values being at the extremes. If a population is normally distributed then 50% of values will be below the mean and 50% above the mean. Also 68 % of values will be within 1 standard deviation of the mean and 95 % will be within 2 standard deviations of the mean. Only continuous data can be normally distributed.
Discrete and rank data cannot be normally distributed unless they are 'transformed', which is well beyond what is expected in IB biology!

Presenting data

If the independent variable is discrete (categories) and the dependent variable is continuous, then you should plot a bar graph with the mean of the data and Y error bars showing the 95% confidence interval or the standard deviation if you have sufficient repeats, or use the range or [expression involving √ lost in extraction] if you do not have sufficient repeats.

If the independent variable is discrete and the dependent variable is rank or count data, you will need to decide if a clear table, a bar graph or a pie chart is the most appropriate way to display the data.

If both variables are continuous, then the data should be plotted as a scatter graph with the independent variable on the X axis. X error bars should represent the uncertainty in the measurement and the Y error bars should represent the uncertainty of the mean of the measured data (either 95% confidence interval or standard deviation if you have sufficient repeats, or use the range or [expression involving √N lost in extraction], where N is the number of values, if you do not have sufficient repeats).

If you have no repeats, or if the uncertainty in the measurement is larger than the variation within the data, then you should use the error in the measurement (e.g. if you were measuring temperature to the nearest degree and your repeat measurements gave you a standard deviation of 0.2, you would use ±1 rather than ±0.2 as it is the larger error).

Which statistical test?

The statistical test you choose depends on the type of question and whether your data is from a normal distribution (parametric) or not (non-parametric). If you are not sure if your data fits a normal distribution it is better to assume it DOES NOT and use the non-normal distribution (non-parametric) tests. Most continuous variables in living things do follow a normal distribution due to the interaction of genes and the environment.
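The two deciding factors named here, the type of question and whether the data can be treated as normally distributed, can be written down as a small lookup. This is only a sketch under stated assumptions: it is not a transcription of the guide's chart, the function name `suggest_test` and its argument values are made up for illustration, and the guide itself does not name the non-parametric test for differences.

```python
def suggest_test(question, normal=False, groups=2):
    """Suggest a test from the type of question and the type of data.

    question: "difference", "correlation" or "frequency"
    normal:   True only if the data can be assumed normally distributed;
              the default follows the advice to assume they are NOT
    groups:   number of data sets compared (only used for differences)
    """
    if question == "frequency":
        # Counted data compared with expected values.
        return "chi-squared"
    if question == "correlation":
        return "Pearson" if normal else "Spearman's rank"
    if question == "difference":
        if not normal:
            # Assumption: this guide does not name a specific test here.
            return "non-parametric test of difference"
        return "t-test" if groups == 2 else "ANOVA"
    raise ValueError(f"unknown question type: {question!r}")


print(suggest_test("difference", normal=True, groups=3))
print(suggest_test("correlation"))
```

Making `normal=False` the default mirrors the recommendation to use non-parametric tests whenever normality is in doubt.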
The chart on the next page tells you how to decide which type of test you need to use. Although there are a large number of other statistical tests, it is probably best to avoid collecting data or posing a question that cannot be answered using one of the tests shown here unless you have a real statistical package available to you! In the sections following the chart are instructions on how to perform each test in Excel.

[Flow chart for choosing a statistical test.] Use the flow chart above to help you decide which is the most appropriate test for the type of question you want to answer. Make sure the data you collect allows you to perform the test you would like to do!

How to process your data in Excel

Descriptive statistics (Mean, Standard deviation, 95% confidence interval)

The mean, standard deviation and confidence intervals are easily calculated in Excel using the following functions:

Mean
=AVERAGE(range)
where range is the range of cells containing the data.

Standard deviation
=STDEV(range)
where range is the range of cells containing the data.

95 % confidence interval
=CONFIDENCE.T(0.05,st_dev,size)
where 0.05 is the significance level, that is 1 minus the confidence level (you can use 0.1 if you want the 90% CI etc.), st_dev is the standard deviation (as calculated above) and size is the number of data points in the sample. The function returns the distance to add to and subtract from the mean.

The standard deviation tells you how variable the data you sampled were. Small standard deviations indicate that all of the data were close together around the mean. A large standard deviation suggests that the data varied a lot and were spread out around the mean. The 95 % confidence interval gives you a range around the mean that has a 95% chance of containing the true mean of the data.

You should have at least 5 measurements before you try to calculate standard deviation or 95% confidence interval. With fewer than this the value is not really reliable and you should use a different estimation of the spread of the data such as the range or [expression involving √ lost in extraction].
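The three spreadsheet functions can be mirrored outside Excel. This is a minimal sketch, assuming a made-up sample of five measurements and a critical t value (2.776 for 4 degrees of freedom, two-tailed, alpha = 0.05) copied from a standard t table rather than computed; none of the numbers come from the guide.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of five repeat measurements (illustration only).
sample = [4.8, 5.1, 5.0, 5.3, 4.8]

# Two-tailed critical value of Student's t for alpha = 0.05 and
# n - 1 = 4 degrees of freedom, taken from a standard t table.
T_CRIT_DF4 = 2.776

n = len(sample)
sample_mean = mean(sample)                     # like =AVERAGE(range)
sample_sd = stdev(sample)                      # like =STDEV(range)
half_width = T_CRIT_DF4 * sample_sd / sqrt(n)  # like =CONFIDENCE.T(0.05, st_dev, size)

print(f"mean = {sample_mean:.2f}")
print(f"SD = {sample_sd:.3f}")
print(f"95% CI = {sample_mean - half_width:.2f} to {sample_mean + half_width:.2f}")
```

The hard-coded critical value only fits a sample of five; a different sample size needs the table value for its own degrees of freedom.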
Null and alternative hypotheses

It is important to remember that these hypotheses are for the STATISTICAL TEST and not for your investigation. You do not need to state a hypothesis as part of your method (although you should have some idea of what you are expecting to find and why!). When you start to analyse your data you should say which test you are going to use and what the null hypothesis (H0) and alternative hypothesis (HA) are. In general your null hypothesis will be that there is NO significant difference/correlation in your results and the alternative hypothesis is that there IS a significant difference or correlation. The statistical tests calculate the probability of getting results like yours if the null hypothesis were true. A high probability value indicates that any differences or correlations could easily have appeared by chance, a low value that the differences or correlations in the data are significant.

How significant is the difference in my data?

Statistical tests are used to give you the probability of getting data like yours if your sets of data were really the same, your data were not correlated, or your data fitted the expected pattern. They cannot tell you WHY the difference or correlation does or does not exist! There is always a chance that the statistics will give you a false positive (i.e. saying there is a significant result when there is not) or a false negative (saying there is no significance when there really is). Because of this scientists usually use a 5% cut off when interpreting the results of tests of difference (there is no good scientific reason for this and it is still a controversial topic), although some scientists will say their data is significant at the 10% level and others will only accept it as significant if it is less than 1%. Generally biologists will use higher significance levels (10% or 5%) than chemists or physicists as it is much harder to control other variables in biological systems.
As part of your evaluations you will have to decide how significant you think the result of your statistical test is!

Testing for differences between 2 sets of data

T tests

In Excel the T.TEST function gives you the probability of seeing a difference at least as large as yours if the 2 data sets (that are separate samples of different individuals) were really the SAME. A T test to look for a difference between 2 sets of data is carried out using the following formula:

=T.TEST(array 1, array 2, tails, type)

Array 1 and 2 are the data sets. See below for discussion of number of tails and type of test.

Data must come from a NORMALLY DISTRIBUTED population for you to perform a T test and there should be at least 10 measurements in each set (but there do not need to be the same number in each group!).

One or two tailed?

The number of tails depends on the data. If you have a good scientific reason for believing that one group MUST be larger than the other then you do a 1 tailed T-test. If the difference in means could be in either direction, that is, you do not have a reason for thinking that one group must be bigger, then you should do a 2 tailed test. If in doubt use a 2 tailed test as you are less likely to get a false positive.

Examples:

You want to know at what age children stop growing. You measure twenty 16 year olds and twenty 17 year olds to see if there is a significant difference in height between the two groups. You know that IF there is a difference then the 17 year olds will be taller, as children grow with age, or stay the same, but never shrink. Because of this you do a 1-tailed T test.

You want to know if boys or girls have a faster reaction time. You test twenty boys and twenty girls. There is no scientific reason for thinking that boys MUST be faster than girls so in this case you do a 2-tailed T test.

Paired or unpaired?

A paired T test is used when the 2 sets of data come from pairs of measurements of the SAME people.
This is often (but not always) a measurement before and after a treatment or a period of time, e.g. if you wanted to know if exercise makes pulse rate increase you would measure the pulse rate of ten people before and after exercise. The data is in PAIRS (person A before and after exercise) and it matters who the data was collected from. This makes a PAIRED T test the correct choice.

An unpaired T test is used when the 2 measurements are from different sets of people. In a test to see if boys are taller than girls, the people measured for the girl category are a completely different set of people to those in the boy category. In this case an unpaired T test is used.

Homoscedastic or heteroscedastic?

In simplified terms, this is about the variability of data. Generally if your measurements all have a similar reliability (e.g. measuring the height of people using a stadiometer) then the data will be homoscedastic. If some readings are less reliable than others (e.g. estimating bird populations, where the counts of small populations will be more accurate than the estimates for large populations) then data may be heteroscedastic. If in doubt use the heteroscedastic test as it is more robust and less likely to give a false positive.

Testing for differences between more than 2 sets of data

It is bad science to do multiple T tests if you have more than 2 sets of data, as by doing so you are increasing the chance that you will get a false positive when there really is no significant difference. You can solve this by doing an ANOVA (ANalysis Of VAriance). This is similar to a T test. To do this in Excel is simple, but you need to have the Data Analysis package installed and activated before it can be used.

Enabling the Data Analysis Package

To enable this click on 'File' in any Excel document and choose 'Options' from the menu at the left hand side; a new window will pop open. Click on 'Add-ins', in the drop down menu at the bottom of the window select 'Excel Add-ins' and click 'GO' near the bottom right of the page. Check the 'Analysis ToolPak' option and click OK. This should activate the package. You should be able to see it as an option when you click on the Data tab in Excel.

Carrying out an ANOVA analysis

For example you have conducted IQ tests on blue, brown and green eyed people. Rather than do 3 T-tests (blue v brown, blue v green, green v brown) you need to do an ANOVA. Your data may be grouped in rows and look something like this… [Screenshot of the data table.]

To perform an ANOVA click on the 'Data' tab and then select 'Data analysis' from the right hand side of the menu. A pop-up will appear. Choose 'Single factor ANOVA' (you are only changing ONE factor, eye colour). In the box that appears select ALL the data including headings, click that the data are grouped by column and tick the 'labels in the first row' box. Tell it to output the data generated into a new worksheet and click OK.

The output looks complicated but is actually easy to understand. It should look something like this: [Screenshot of the ANOVA output.]

In this case the P-value is much greater than 0.05 so we can conclude that there is no significant difference between the IQ of people with different coloured eyes.

Below is the output of the data for marks in the General Knowledge quiz of randomly selected P6, S3 and UB students. [Screenshot of the ANOVA output.]

From this you can see the P-value is 8.2 × 10^-19, which is MASSIVELY lower than 0.05. This tells us that the year group DOES have a significant effect on the score in the GK quiz. Looking at the data you can clearly see who has the highest score and who has the lowest score.

Looking for CORRELATIONS between variables.

What does it tell us if data are correlated

Correlations tell us if 2 things vary together. This could be because one is causing the other, or it might be that the two things are being affected by a third unknown variable or a complex set of changes.

e.g. lung cancer is correlated with smoking because chemicals in tobacco smoke CAUSE tumours to form; if you reduce the amount you smoke you can reduce your risk of cancer.

Height is correlated with shoe size because the genetic and environmental factors that affect your height also affect all of your other bones. If I fed you more protein and calcium you would get taller AND your feet would grow larger. However if I stretched you, your feet would not magically stretch too; there is no cause and effect.

Climate change has increased as piracy has fallen; this is not because pirates reduce global warming but rather because both are affected by the complex changes in society over the last 200 years! If we introduced more pirates there would be no reduction in climate change!

Which test?

Testing to see if data is correlated is easy to do in Excel. There are two types of test: Pearson Correlation and Spearman's Rank. To decide which one to do you need to think about your data. If it is CONTINUOUS data that is normally distributed in the population then you can use a Pearson test. If not then you need to use a Spearman's rank.

Pearson test for correlation (r).

To carry out a Pearson test for correlation you use the formula PEARSON(array1, array2), where array1 is your first set of data and array2 is your second set of data. The value returned is a number between -1 (for very strong negative correlation) and 1 (for very strong positive correlation). A value of 0 means no correlation.

You can test the significance of your value by comparing your value with the critical value for the sample size you have tested. This will allow you to decide if the correlation you have measured is real or has occurred by chance. To do this decide on how many degrees of freedom your data has (in the case of correlations df = N-2, where N is the number of data points you have). You look this up in a table of critical values for the Pearson correlation coefficient and see if your value is larger than the value given for the 0.05 significance level. If it is, you can reject the null hypothesis that the correlation has occurred by chance and accept the alternative hypothesis that there is a real correlation between the 2 measurements.

You can find tables for the Pearson correlation coefficient here: http://www.gifted.uconn.edu/siegle/research/correlation/corrchrt.htm

e.g. in an experiment to see if arm span was related to height 7 people were measured and a correlation coefficient of 0.973 was calculated. The number of degrees of freedom is n-2 = 5. The critical value at p=0.05 for 5 df is 0.754. Your value is higher than this so the correlation IS significant. You report your findings in the following way:

A Pearson test for correlation was carried out and found the following: r(5) = 0.973, p<0.05

i.e. r(number of degrees of freedom) = calculated coefficient, probability value used

Spearman's Rank test for correlation

This test uses the same formula as the Pearson correlation BUT the test is carried out on RANKED data, not the raw data. You can rank the data in 2 ways:

If you have small amounts of data you can rank it by hand: the highest value in each set of data receives the value 1, the next highest 2 etc. If 2 pieces of data share the same value then their ranks are averaged.

If you have a lot of data it may be quicker to do it using the following formula in Excel:

=RANK.AVG(number,ref,order)

where number = the cell reference of the value you are putting into order (e.g. A2), ref = the cell references of the range of data you are comparing it to (e.g. $A$2:$A$11; the dollar signs give you the 'absolute' cell reference so that when you copy and drag the formula down it still looks at the same reference, just press F4 when your data is selected to do this!) and order = 0 to rank in descending order (the highest value receives rank 1) or 1 to rank in ascending order.

e.g. when testing to see if there is a relationship between height and nose size the formula to rank the height would be =RANK.AVG(A2,$A$2:$A$11, 0) and the data may look like this… [Screenshot of the ranked data.]

As for the Pearson correlation, a value close to 1 or -1 signifies very strong correlation and close to 0 signifies no correlation. You can use the tables for the Pearson coefficient to determine the significance of the values in the same way as for a Pearson correlation.

Testing to see if your observed data matches the expected data

The CRITICAL value for χ2 is 7.82. This is MUCH lower than our calculated value, therefore we REJECT the null hypothesis that there is no preference and ACCEPT the alternative hypothesis that there IS a significant difference in the number of people choosing different fizzy drinks. Looking at the data it is obvious that this difference is a preference for Inca Cola.

If our value had been smaller than 7.82 we would have had to ACCEPT the NULL hypothesis and conclude that all the drinks tested were equally popular.

What type of data?

This test is for testing if the observed FREQUENCY of something matches the null hypothesis 'expected' values. It can only be performed when you have counted something. Your null hypothesis is USUALLY that there will be no difference in frequency between the groups (e.g. if I asked 100 people to choose a red, green, blue or white t-shirt then my null hypothesis would be that 25 would choose red, 25 blue, 25 green and 25 white).

The only exceptions to this are genetic crosses, where you could be expecting a 3:1 ratio or a 9:3:3:1 ratio, or work with populations that have a known make-up (e.g.
if I know that there are 100 girls and 50 boys in a school my null hypothesis for a class of 15 students would be that there would be 10 girls and 5 boys) To perform a Chi-Squared test you must have COUNTED something and the expected values must be at least 5. The EXPECTED values do not have to be whole numbers! If you do Higher biology they expect you to be able to calculate Chi-squared by hand. If not you can do it using Excel! Calculating Chi Squared by hand For example you want to decide if one soft drink was more popular among P6 students that others so you gave away soft drinks to 100 P6 pupils and give them a choice of Coca Cola, Inca Cola, Fanta and Sprite. You recorded the OBSERVED numbers of pupils choosing each drink and got the following data; Inca cola Observe d Coca cola 52 Fanta 20 18 Sprite 10 Your Null hypothesis is that no drink is more popular than the others so your EXPECTED values are 25 for each drink. You first need to calculate the Chi squared value and then look to see if the critical value for the number of degrees of freedom in the χ2 table is statistically significant. The formula for Chi squared is as follows; χ2 ∑ Where O = the observed value and E = expected value. It is useful to make a table to show the calculation process like this. Drink Observed Expected O-E (O-E)2 𝑂 𝐸 𝐸 Inca Cola 52 25 27 729 Coca cola 20 25 -5 25 1 Fanta 18 25 -7 49 1.96 Sprite 10 25 -15 225 𝑂 𝐸 2 29.16 9 41.12 𝐸 The number of ‘degrees of freedom’ is the number of categories minus 1. so χ2 = 41.12 (3 d.f). You look up the critical value for a probability of 0.05 at 3df in the tables Calculating Chi‐squared using Excel You need a results table that contains the Observed data and Expected data. Use the formula =CHISQ.TEST(actual_range, expected_range) where ‘actual range’ is the observed data and ‘expected range’ is your expected data. It will work whether your data is arranged in rows or columns. 
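The hand calculation for the fizzy drink example can also be sketched in a few lines of Python. This is only an illustration of the arithmetic in the text, not part of the booklet: the function name `chi_squared` is made up for this sketch, and the critical value 7.82 is the one quoted in the text for 3 degrees of freedom at the 0.05 level.

```python
def chi_squared(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed choices of 100 pupils: Inca Cola, Coca Cola, Fanta, Sprite
observed = [52, 20, 18, 10]
# Null hypothesis: no drink is more popular, so 25 pupils expected for each
expected = [25, 25, 25, 25]

chi2 = chi_squared(observed, expected)
degrees_of_freedom = len(observed) - 1  # number of categories minus 1

CRITICAL_VALUE = 7.82  # quoted in the text for 3 df at the 0.05 level
reject_null = chi2 > CRITICAL_VALUE

print(f"chi-squared = {chi2:.2f} with {degrees_of_freedom} d.f.")
print("Reject the null hypothesis" if reject_null else "Accept the null hypothesis")
```

The Excel formula =CHISQ.TEST described above skips the critical-value lookup and returns a probability directly.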
The number that =CHISQ.TEST returns is the probability of getting a difference at least as large as the one you observed if the null hypothesis is true (i.e. if the observed and expected values are really the SAME). As normal, if this is less than 0.05 then the test shows a SIGNIFICANT difference and the observed and expected values are DIFFERENT. If the value is greater than 0.05 then you have to ACCEPT the null hypothesis that the observed and expected values are the SAME.

Page 81

Bio Factsheet
September 2000, Number 74
Hypothesis Tests & Mann-Whitney U-test

Many biology projects involve a hypothesis test. Students often find difficulty in deciding on suitable hypotheses, and accordingly can waste time collecting unhelpful data. This Factsheet explains what is involved in a statistical test of a hypothesis and discusses the role of the null hypothesis and the level of significance. It also covers in detail the calculation and use of the Mann-Whitney U-test, which may be applicable to sets of biological data.

Hypotheses

All statistical tests involve testing a null hypothesis (H0) against an alternative hypothesis (H1).

The null hypothesis can be described as "the boring case" - i.e. nothing has changed. For example:
H0: There is no difference in species diversity in riffles and flats in a particular river
H0: There is no correlation between the yield of a particular crop and the amount of water supplied
H0: There is no difference in the amount of lichen on the north and south sides of a tree

The null hypothesis cannot ever be of the form "something is greater than something else" or "something is related to something else". The exact form of it depends on the test you are using (see September 1997, Bio Factsheet No. 03 - Which Stats Test Should I Use?).

The alternative hypothesis is effectively saying the opposite of the null hypothesis - for example, for the last case above the alternative hypothesis would be:
H1: There is a difference in the amount of lichen on the north and south sides of a tree

When we carry out an investigation, we start off assuming that the null hypothesis is true, and only change our minds if the data obtained in the investigation provide strong enough evidence. This is rather analogous to the situation in a court of law, where the defendant is assumed innocent unless proven guilty!

Obviously there will always be some chance variations - if, for example, the areas of lichen on the north and south sides of the tree only differed by 0.1 mm², we would probably feel - even without conducting any tests! - that this was not a "significant" difference. The role of the statistical test is to give an objective definition of what constitutes sufficient evidence to reject H0.

Choosing Hypotheses

Good hypotheses for a statistical test must:
• be specific, not vague or general
• refer to something that can be measured in an unambiguous way
• be simple and not attempt to include several variables
• include a null hypothesis

Table 1 shows some examples of "bad" hypotheses, and how they can be improved.

Table 1. Choosing a hypothesis

Original hypothesis: the more you smoke, the less fit you are.
What's wrong with it:
• Not specific - how do you measure fitness?
• Need to eliminate other variables such as age
• There's no null hypothesis.
Improved version:
H0: there is no correlation between the number of cigarettes smoked and recovery time.
H1: there is some correlation between the number of cigarettes smoked and recovery time.

Original hypothesis: the closer to the road, the higher the pollution.
What's wrong with it:
• Not specific - what sort of pollution?
• How will it be measured?
• There's no null hypothesis.
Improved version:
H0: there is no correlation between lead levels and distance from the road.
H1: there is some correlation between lead levels and distance from the road.

Original hypothesis: slope affects vegetation.
What's wrong with it:
• Not specific - which aspect of the slope is referred to? Is it the gradient, the altitude or the length of the slope?
• Not measurable - you cannot just measure "vegetation". Should it be species diversity, or percentage cover, or biomass, or incidence of a particular species?
• There's no null hypothesis.
Improved version:
H0: the gradient of the slope has no effect on percentage cover.
H1: the gradient of the slope has some effect on percentage cover.

Doing the test

Whichever statistical test is used, we will effectively be plugging our data values collected in the experiment into some formula, and coming out with a single number. This number is what we will use to decide whether or not to reject the null hypothesis.

To make that decision, we will have to compare this number we have worked out - which is often called a test statistic - to the appropriate statistical table. There are different tables for different statistical tests. Table 2 shows an extract from a statistical table for the Mann-Whitney U-test. Statistical tables give critical values at various significance levels.

The significance level is a measure of how strong we are requiring the evidence to be before we reject H0. Common significance levels used are 10% (0.1), 5% (0.05) and 1% (0.01). A 1% significance level, for example, means that we have only a 1% chance of rejecting H0 when we shouldn't have, whereas a 10% level would give us a 1 in 10 chance of rejecting H0 when we should have accepted it.

To get an idea of what this means, imagine 100 students carrying out the same investigation into areas of lichen on the north and south sides of the same species of tree. We will imagine that there is really no significant difference in area - in other words, the null hypothesis is true. The students will all collect their data at different times, and hence get slightly different results. If they all carry out their statistical test at a 5% significance level, then on average five of them would find themselves rejecting the null hypothesis, because they happened to get "odd" data.

The critical value is just the value we have to compare with the number we have worked out - the test statistic - to decide whether or not we should reject H0. Each significance level - and each sample size - has its own critical value. Critical values come from books of statistical tables.

For every test except Mann-Whitney, we reject H0 if our value is bigger than the critical (tables) value. For the Mann-Whitney U-test, we reject H0 if our value is smaller.

Table 2. Critical values for the U-test

                              n1 (1st sample)
                     5           6           7           8
α                 10%   5%    10%   5%    10%   5%    10%   5%
n2 (2nd sample)
5                  4    2      -     -     -     -     -     -
6                  5    3      7     5     -     -     -     -
7                  6    5      8     6    11     8     -     -
8                  8    6     10     8    13    10    15    13

It is worth noting the way that the critical values vary with sample size; with a large sample, it is much easier to get a significant result!

Mann-Whitney U-test

We use the U-test to compare the median values of two sets of data - e.g. the species diversity on a path and off a path. We are just trying to find out whether there is a difference - e.g. whether being on a path affects the diversity.

The hypotheses to be tested are:
H0: there is no difference between X and Y
H1: there is a difference between X and Y
(If you wish to be mathematically correct, you would use H0: median1 = median2; H1: median1 ≠ median2)

We will reject the null hypothesis if the value we calculate (the test statistic) is below the value from the tables (the critical value).

GLOSSARY
Simpson's Diversity Index is calculated using the formula
$\text{Diversity} = \frac{N(N-1)}{\sum n(n-1)}$
where n refers to the number of individuals of each particular species and N is the total number of individuals.

The procedure for carrying out the test will be illustrated by applying it to data on species diversity on and off a path. Each step of the METHOD is followed by its APPLICATION to these data.

1. Write down your hypotheses.
   H0: There is no difference in species diversity on and off the path
   H1: There is a difference in species diversity on and off the path

2. Obtain data about the things you wish to compare - you need two sets of data. Each set of data must contain at least 5 values, but they don't have to have the same number of values as each other.
   The figures for species diversity in 8 path sites and 8 off-path sites are:

   Site       1      2      3      4       5      6      7      8
   on-path    2.20   4.65   6.00   3.47    4.33   2.20   2.50   3.33
   off-path   4.09   2.93   3.88   10.50   3.50   5.14   4.40   10.00

3. Consider one set of data at a time - say we start with the on-path sites. We now must calculate a score for each on-path site. A site is given:
   1 for every off-path site that has a higher value
   0.5 for every off-path site that has an equal value.
   Site 1, on-path: The following off-path sites have higher values: 1, 2, 3, 4, 5, 6, 7, 8. Hence the score is 8.
   Similarly, for the other on-path sites, the scores are:

   Site    1   2   3   4   5   6   7   8
   Score   8   3   2   7   4   8   8   7

4. Sum the scores of the on-path sites. This gives you the overall on-path score.
   So the total on-path score is 47 (8+3+2+7+4+8+8+7).

5. Repeat steps 3) and 4) for off-path sites. This time, you will be awarding points for on-path sites with higher or equal values.
   Site 1, off-path: The following on-path sites have higher values: 2, 3, 5. Hence the score is 3.
   Similarly, for the other off-path sites the scores are:

   Site    1   2   3   4   5   6   7   8
   Score   3   5   3   0   3   1   2   0

   So the total off-path score is 17.

6. Take the smaller of the two scores. This is the U-value.
   So the U-value is 17.

7. Compare your calculated U-value with the critical value in the tables at the appropriate significance level. If your value is smaller, reject H0 - otherwise accept H0.
   The critical value for two samples of 8 is 13 at the 5% level of significance. Since our value is 17, we accept H0 and conclude that there is no significant difference between species diversity on and off the path.
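The counting procedure in steps 3 to 7 can be sketched in Python as a check on the hand calculation. This is a minimal illustration using the on-path and off-path figures from the worked example; the function name `count_score` is hypothetical, and the critical value 13 is the one quoted from the table for two samples of 8 at the 5% level.

```python
def count_score(sample, other):
    """Score a sample: 1 point for every value in `other` that is higher
    than a value in `sample`, and 0.5 points for every equal value."""
    score = 0.0
    for value in sample:
        for rival in other:
            if rival > value:
                score += 1.0
            elif rival == value:
                score += 0.5
    return score


on_path = [2.20, 4.65, 6.00, 3.47, 4.33, 2.20, 2.50, 3.33]
off_path = [4.09, 2.93, 3.88, 10.50, 3.50, 5.14, 4.40, 10.00]

on_score = count_score(on_path, off_path)    # steps 3 and 4
off_score = count_score(off_path, on_path)   # step 5
u_value = min(on_score, off_score)           # step 6: take the smaller score

CRITICAL_U = 13  # quoted for two samples of 8 at the 5% level
reject_null = u_value < CRITICAL_U           # step 7: reject H0 only if U is smaller

print(f"on-path score = {on_score}, off-path score = {off_score}, U = {u_value}")
print("Reject H0" if reject_null else "Accept H0")
```

A useful check on the counting is that the two scores always add up to the product of the two sample sizes (here 8 × 8 = 64).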
Projects using the Mann-Whitney U-test
• Find the recovery time after exercise (the time taken for the heart rate to return to normal) for at least five smokers and five non-smokers of approximately the same age. Test whether or not there is a difference in recovery rates.
• Measure the percentage vegetation cover at at least 5 sites on shaded slopes and at least 5 sites on non-shaded slopes. Test whether or not there is a difference in % cover.
• For a river with a source of pollution, find the species diversity at five or more sites above and five or more sites below the source of pollution. Test whether or not there is a difference in species diversity.

Acknowledgements; This Factsheet was researched and written by Cath Brown. BioPress Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136

Page 82

Bio Factsheet
January 1999, Number 34
Species Diversity

Species diversity is a very important ecological idea. It can be expressed mathematically and describes the number of individuals and the number of species in a community. It is estimated that there are 13-14 million different species on Earth. Humans have recorded about 2 million of these; in other words we simply know nothing about large parts of the animal and plant kingdom (Table 1). The ecosystems with greatest species diversity are tropical rainforests, coral reefs and large tropical lakes.

Table 1. Proportion of species discovered annually

Species           New species discovered annually       Proportion of species described
                  (as a % of those already known)
Birds             0.8                                    High
Reptiles          1.17                                   High
Platyhelminthes   1.58                                   Moderate
Fungi             2.43                                   Very low

What seems more certain is that, in terms of number of species, insects rule the planet! (Fig 1).

Fig 1. Possible proportions of total species
KEY
A Vertebrates 0.4%
B Algae 1.6%
C Protozoans 1.6%
D Molluscs 1.6%
E Plants 3.0%
F Bacteria 3.0%
G Viruses 4.0%
H Nematodes 4.0%
I Arachnids 6.0%
J Fungi 8%
K Insects 65.0%

Measuring species diversity

The simplest way of measuring diversity is to count the number of different species. A garden on the outskirts of a small village might be visited by approximately thirty different species of birds; a city garden will probably have many fewer. So we can count the number of species present under standard conditions and produce a quantitative measure of diversity. There is a problem here, however. Look at the example in Table 1.

Table 1. Number of different species of plant found in two areas

            Total number of plants in
Species     Quadrat X    Quadrat Y
A           95           18
B           2            23
C           1            27
D           3            14
E           1            20

This shows the plants found in two quadrats in some sand-dunes on the Welsh coast. There are five species in each but common sense suggests that the overall diversity is very different. Nearly all the plants in quadrat X belong to species A. In quadrat Y all five species are present in large numbers. Quadrat Y seems to have greater diversity than quadrat X. What we need is a way of calculating diversity which takes into consideration the number of individuals as well as the number of species. There are many different ways of doing this but one of the simplest is to calculate an index of diversity using the formula shown in Box 1.

Why is diversity important?

In order to interpret information about diversity we need to understand a very important principle. The distribution of living organisms is influenced by abiotic factors such as the amount of rainfall, soil pH, temperature and so on. The more extreme these abiotic conditions, the fewer the species that can survive and, therefore, the lower the diversity of organisms found there.

We will use this principle to compare the diversity of living organisms found in the Arctic with the diversity of those found in tropical rain forests. In the Arctic, over the long winter period, temperatures rarely rise above freezing and, as a consequence, water remains biologically unavailable, frozen solid as ice. The Arctic winter is not only cold but dark; for several months the sun barely shows above the horizon. These are clearly extremely harsh abiotic conditions. Not surprisingly, relatively few species are adapted to survive an Arctic winter. Arctic ecosystems therefore tend to have low diversities.

However, within a tropical rain forest, there is water in abundance and temperatures are high throughout the year. Many organisms can survive in these conditions and the species diversity in such places can be very high.

Exam Hint - As a general rule, the greater the species diversity in a particular ecosystem, the more stable it is.

Diversity and Succession

Succession is the ecological process in which the different species of organisms in a community are gradually replaced by others over a period of time. Sand dunes are found in many areas around the coast. Near the sea, abiotic conditions are harsh. The sand is blown by the wind and is unstable. It contains little humus and therefore dries out very rapidly. There are also low levels of soil nutrients such as nitrates. One of the few plants able to survive in these conditions is marram grass. In time, the roots of the marram grass bind the sand particles together. Plants die and decompose, increasing the humus in the soil. Marram grass can no longer survive and it is replaced by other species of plant. It is the basic principle that we applied earlier. As succession takes place, abiotic conditions become less severe. More species occur and there is a higher diversity. This is summarised in Fig 2.

Fig 2. Sand dune succession (from the sea, moving inland)
Early stages in succession: This is a very hostile environment. It is very exposed and sand grains are continually being blown by the wind. There is little humus in the soil and water is not retained very well. Few species can survive in these conditions.
Later stages in succession: Conditions are far more suitable for plant growth. Plant cover means that the sand is no longer being blown away. The humus in the soil resulting from dead plant remains enables it to hold water much better. More species grow here.
Result: Increasing diversity as succession progresses.

Fig 3. A rocky sea shore
Upper shore: Harsh environment where, over the course of a tidal cycle, the temperature and salinity can vary enormously. Organisms are completely covered in water at high tide while, at low tide, they are exposed to the drying effect of the air.
Lower shore: Organisms are almost always under water. Temperature and salinity fluctuate very little. Dehydration is not a problem.
Result: Many more species of marine organisms can live on the lower shore. Therefore diversity increases down the shore.

Box 1. Calculating Diversity

The index of diversity (d) is calculated from the following formula

$d = \frac{N(N-1)}{\sum n(n-1)}$

where N = total number of organisms of all species
and n = total number of organisms of a particular species

Table 2 shows the number of plants of different species growing between the railway lines in a station.

Table 2.

Species              Number of plants of this species in study area
Dandelion            7
Oxford ragwort       28
Common sowthistle    1
Buddleia             2
Mugwort              5

$d = \frac{43 \times 42}{(7 \times 6) + (28 \times 27) + (1 \times 0) + (2 \times 1) + (5 \times 4)}$

$d = \frac{1806}{42 + 756 + 0 + 2 + 20}$

$d = \frac{1806}{820} = 2.2$

On its own, this figure of 2.2 tells you very little. It does, however, allow you to compare the diversity of plants growing between the railway lines with the diversity of plants growing in other areas. In this particular case, the diversity was much lower than on a disused piece of rail track nearby.
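The index of diversity is simple to compute once the counts are in a list, so a short Python sketch may help when comparing several sites. The function name `diversity_index` is hypothetical. The railway counts come from the worked example; the quadrat X and quadrat Y counts are read from the sand-dune table on the assumption that the first five numbers in it belong to quadrat X and the last five to quadrat Y.

```python
def diversity_index(counts):
    """Index of diversity d = N(N-1) / sum of n(n-1),
    where each n is the count for one species and N is the total count."""
    counts = list(counts)
    total = sum(counts)
    denominator = sum(n * (n - 1) for n in counts)
    if denominator == 0:
        raise ValueError("index is undefined when no species has more than one individual")
    return total * (total - 1) / denominator


# Plants growing between the railway lines (worked example in Box 1)
railway = {
    "Dandelion": 7,
    "Oxford ragwort": 28,
    "Common sowthistle": 1,
    "Buddleia": 2,
    "Mugwort": 5,
}
d_railway = diversity_index(railway.values())

# Sand-dune quadrats, species A to E (column assignment is an assumption)
quadrat_x = [95, 2, 1, 3, 1]
quadrat_y = [18, 23, 27, 14, 20]
d_x = diversity_index(quadrat_x)
d_y = diversity_index(quadrat_y)

print(f"railway d = {d_railway:.1f}")
print(f"quadrat X d = {d_x:.2f}, quadrat Y d = {d_y:.2f}")
```

Both quadrats contain five species, but the index comes out much higher for quadrat Y, which matches the common-sense judgement in the text that quadrat Y is the more diverse.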
Diversity and zonation

Abiotic conditions often vary within an ecosystem. Around the high tide level on a rocky seashore, for example, abiotic conditions are extreme. Organisms that live there must be able to withstand considerable fluctuations in temperature and salinity. On the lower shore, abiotic conditions show much less variation. Organisms are covered by sea water for much of the day. Temperature and salinity will vary little. Applying our general principle again, as we go down the shore, abiotic conditions become much less severe (Fig 3).

Diversity and Pollution

Human activity frequently leads to pollution of the environment. Pollution results in harsher environmental conditions, so the more pollution, the lower the diversity of organisms.

Biologists can make use of this idea to monitor pollution levels. A number of different indices of diversity have been designed to assess water quality. They take into account the fact that some groups of animals are much more sensitive to pollution than others. Each group that is present at a particular site is given a value, the values are added together and a figure is obtained which provides an idea of the amount of organic pollution at the site.

Acknowledgements; This Factsheet was researched and written by Bill Indge. Curriculum Press, Unit 305B, The Big Peg, 120 Vyse Street, Birmingham. B18 6NF. Bio Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136

Page 83

Bio Factsheet
www.curriculumpress.co.uk www.curriculum-press.co.uk
Number 144
Spearman's Rank Correlation Coefficient

Spearman's Rank is one method of measuring the correlation between two variables.
Correlation may be:
• positive (large values of one variable associated with large values of the other variable - eg nitrate concentration and plant growth)
• negative (large values of one variable associated with small values of the other - eg soil salinity and plant growth)

Correlation is measured on a scale from -1 to 1:
-1   Perfect negative rank correlation
 0   No correlation
 1   Perfect positive rank correlation

Which correlation coefficient?

There are three correlation coefficients in common use; Spearman's is used most often (and hence is the principal subject of this Factsheet), but there are cases when the other coefficients should be considered:

Spearman's Rank Correlation Coefficient
• Can be used for any data that you can put in order smallest to largest
• Measures whether data are in the same order - eg does highest nitrate concentration coincide with highest plant growth - rather than using actual data values
• Not valid if there are a lot of ties (eg several pairs of samples having the same pollution level), although one or two ties is OK.
• Easy to calculate for small data sets, but unwieldy for large data sets.

Pearson's Product Moment Correlation Coefficient
• Can only be used for continuous data (eg lengths, weights)
• Uses the actual data, not just their ranks
• Measures how close to a straight line the data are - check on a scatter graph that the data do approximate a straight line rather than a curve.
• Can be easier to get significant results than using rank correlation
• A nuisance to calculate by hand, but can be calculated automatically on many graphic calculators and using a spreadsheet
• If you are unsure whether it is valid, it's better to use rank correlation

Kendall's Rank Correlation Coefficient
• Like Spearman's, uses the ranks of the data rather than the actual data, and can be used for any data that can be ordered.
• A good substitute for Spearman's if there are a lot of ties
• More of a nuisance to calculate than Spearman's

The flowchart shows how to choose your correlation coefficient.

Flowchart:
Is the data continuous? (eg lengths, weights etc)
   NO: use RANK CORRELATION
   YES: Do the data look close to a straight line?
      YES: use PEARSON'S
      NO: use RANK CORRELATION
For RANK CORRELATION: Are there a lot of ties?
   YES: use KENDALL'S
   NO: use SPEARMAN'S

Hypotheses

As with any other statistical test, you are using the test to decide between two hypotheses:
- the null hypothesis (H0) - which is what you assume, until you get convincing evidence otherwise.
- the alternative hypothesis (H1) - which is what you hope to get evidence for.

For any test of correlation, your null hypothesis is always:
H0: there is no correlation between X and Y

The alternative hypothesis can take three possible forms:
a) H1: there is some correlation between X and Y
or b) H1: there is positive correlation between X and Y
or c) H1: there is negative correlation between X and Y

If you have a good scientific reason in advance (before actually getting any results) for expecting a particular type of correlation, then choose b) or c). If you do not have a reason for expecting a particular type, use a). If in doubt - use a).

Alternative hypotheses b) and c) above are referred to as directional because they specify a particular "direction" of correlation. Alternative a) is non-directional. When you are doing the actual statistical test, you need to be aware that a non-directional alternative requires you to do a 2-tailed test, but a directional alternative requires a 1-tailed test - further details are given in the worked example.

Exam Hint: - Only the alternative hypothesis can be directional - the null hypothesis is never directional.

Sample Size

The absolute minimum number of values for using Spearman's Rank is 4 - but it is very hard to get a significant result using this few! It's best to use at least 7 - and if you can get up to about 15, better still. Very large sample sizes (50+) can make it hard to handle the calculations, and many Spearman's tables do not go up this high.

Ranking

Ranking is similar (though not identical) to awarding places in a race. When doing the ranking, it does not matter whether you give the rank "1" to the largest value, or to the smallest value - provided you are consistent. If there are no ties, you just give out the ranks in the obvious way, starting at 1 and carrying on to however many pieces of data you have.

If there are ties, you have to be a bit careful. For example, suppose three pieces of data tie for 4th place. Normally, if there hadn't been any ties, you'd expect the next three pieces of data to "use up" the ranks 4, 5, 6. So we give all three pieces the average of 4, 5 and 6 - that's 5. The next piece of data then has rank 7 (as ranks 4, 5 and 6 have been "used up").

Worked Example

The data below were collected on soil salinity and plant height.

Soil salinity        28   12   15   16    2    5
Plant height (mm)    10   40   40   52   75   48

Step 1: Write down the hypotheses
H0: There is no correlation between soil salinity and plant height
H1: There is negative correlation between soil salinity and plant height
(This is a sensible choice provided we know the plant is not a halophyte.)

Step 2: Work out the two sets of ranks, taking care to allow for ties.
We'll give rank 1 to the highest values for each:

Soil salinity        28   12    15    16    2    5
Rank                  1    4     3     2    6    5
Plant height (mm)    10   40    40    52   75   48
Rank                  6   4.5   4.5    2    1    3

(The two "40" values tie. They'd normally have used up 4th and 5th place - so give them both the average of 4 and 5 - that's 4.5. The next one will have rank 6, as ranks 4 and 5 have been used up.)

Step 3: Work out "d" and "d²", where d stands for the differences between pairs of ranks (signs are ignored here, since each d is squared).
Note: you must square each d individually

Soil salinity rank   Plant height rank   d     d²
1                    6                   5     25
4                    4.5                 0.5   0.25
3                    4.5                 1.5   2.25
2                    2                   0     0
6                    1                   5     25
5                    3                   2     4

Σd² = 25 + 0.25 + 2.25 + 0 + 25 + 4 = 56.5

Step 4: Substitute into the formula

$r_s = 1 - \frac{6 \sum d^2}{n^3 - n}$

where r_s = Spearman's Rank Correlation Coefficient, Σd² = sum of the d² values and n = number of pairs of values in the sample.

Here n = 6, so

$r_s = 1 - \frac{6 \times 56.5}{6^3 - 6} = 1 - \frac{339}{210} = -0.6143$

Step 5: Get a Spearman's table and look up the critical value for the appropriate significance level (usually 5% = 0.05), sample size and 1-tailed or 2-tailed test.

1-tail    0.1     0.05    0.025   0.01    0.005
2-tail    0.2     0.10    0.05    0.02    0.01
n = 4     1.000   1.000   1.000   1.000   1.000
n = 5     0.700   0.900   0.900   1.000   1.000
n = 6     0.657   0.771   0.829   0.943   0.943
n = 7     0.571   0.679   0.786   0.857   0.893

We have n = 6, and we are doing a one-tailed test, because of the form of H1. So the critical value is 0.771.

Step 6: Make a decision - if your calculated r_s value is bigger than the critical value (ignoring signs), you can reject the null hypothesis. Otherwise you must accept it.

Our value (-0.6143) is smaller than the critical value (ignoring signs). So we must accept the null hypothesis - there is no significant correlation between soil salinity and plant height at the 5% significance level.
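Steps 2 to 6 of the worked example can be sketched in Python, which is handy for checking the ranking of ties. This is only an illustration: the helper names `average_ranks` and `spearman_rs` are made up for this sketch, the formula is the one given in Step 4, and the critical value 0.771 is the one quoted in the text for n = 6 in a one-tailed test at the 5% level.

```python
def average_ranks(values):
    """Rank values with 1 = highest; tied values share the average
    of the ranks they would otherwise have used up."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for value in values:
        first_place = ordered.index(value) + 1   # first rank taken by this value
        tie_count = ordered.count(value)         # how many values share it
        ranks.append(first_place + (tie_count - 1) / 2)
    return ranks


def spearman_rs(x, y):
    """Spearman's rank coefficient using rs = 1 - 6*sum(d^2) / (n^3 - n)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * sum_d_squared / (n ** 3 - n)


soil_salinity = [28, 12, 15, 16, 2, 5]
plant_height = [10, 40, 40, 52, 75, 48]

rs = spearman_rs(soil_salinity, plant_height)

CRITICAL_VALUE = 0.771  # quoted in the text: n = 6, one-tailed, 5% level
reject_null = abs(rs) > CRITICAL_VALUE  # compare ignoring signs

print(f"rs = {rs:.4f}")
print("Reject H0" if reject_null else "Accept H0")
```

Ranking both variables in the same direction (here, 1 = highest) matters more than which direction is chosen, as the Ranking section points out.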
Further Investigations Using This Test
• Relationship between concentration of fungicide and zone of inhibition for a particular fungus
• Relationship between molecular size and rate of metabolism in yeast
• Relationship between algal growth and nitrate concentration
• Relationship between blackspot disease in roses and traffic levels
• Relationship between mass of leaf buried and earthworm mass
• Relationship between pest density and yield for broad beans
• Relationship between body mass and running ability for house spider
• Relationship between pH of soil and pH of leaf litter

Acknowledgements: This Bio Factsheet was researched and written by Cath Brown. Curriculum Press. Bank House, 105 King Street, Wellington, TF1 1NU. Geopress Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136

Page 84

Bio Factsheet, January 2003, Number 120, www.curriculumpress.co.uk

The Chi-Squared Test for Association

The chi-squared test for association (or association index or chi-squared contingency tables) is commonly used in projects to measure whether two factors are associated (e.g. whether greater numbers of a certain plant species occur in areas with high rainfall). This Factsheet explains how to use this test.

When do I use this test?

This test is one way of examining whether two variables are related - for example, "Is the colour of hydrangea flowers related to the type of soil?"
• The null hypothesis will be that the two variables are not associated or are independent. (This means that they do not affect each other)
• The alternative hypothesis will be that the two variables are associated or are not independent.

So, in the above example, the hypotheses would be:
H0: Hydrangea flower colour is independent of soil type
H1: Hydrangea flower colour is not independent of soil type

It is important to note that the alternative hypothesis does not tell you how the two variables are related - it could be that pink flowers occur in sandy soil and blue in clay soil, or vice versa.

Any chi-squared test can only be used with frequencies (that is, numbers of items in particular categories), and all expected frequencies must be at least 5. The easiest way to guarantee this is to make sure there are at least 5 items in each category, but if this really isn't possible, you can usually get away with one category having fewer than 5 if all the others have substantially more. If your expected frequencies turn out to be less than 5, you should collect more data and redo the test.

Table 1. Investigations using chi-squared test for association

Investigation: Whether colour of hydrangea flowers is affected by soil type
Null Hypothesis: Colour of flowers is independent of soil type
What to Measure: Choose two areas of differing soil types, and note the number of hydrangea plants in each area of each colour.

Investigation: Whether a particular type of caterpillar feeds preferentially on a particular type of plant
Null Hypothesis: Caterpillar species' feeding is independent of the type of plant
What to Measure: In a specific area, note the number of plants of the type required with any of the caterpillars on them, the number of such plants without caterpillars, the numbers of other plants with caterpillars and without caterpillars.

Investigation: Whether snails of the same species with different coloured shells prefer different habitats
Null Hypothesis: Shell colour is independent of habitat preference.
What to Measure: Using the same area quadrat in each habitat, note the number of snails of each type found.

Investigation: Whether dandelions and plantains grow preferentially together.
Null Hypothesis: Incidence of dandelions is independent of incidence of plantains
What to Measure: Using a standard size quadrat, note the number of quadrats in which both species are found, in which neither are found, and in which just dandelions or just plantains are found.

How does it work?

To see how this test works, let's look at hydrangea flowers in different soil types. The flowers can be pink or blue, and we'll look at 50 plants grown in clay soil and 50 plants grown in sandy soil. Suppose the results were:

        sand    clay
pink    25      25
blue    25      25

From these results, we would probably conclude that there was no link between flower colour and soil type - flower colour and soil type were independent.

If, instead, the results were:

        sand    clay
pink    50      0
blue    0       50

We'd conclude there was a link between flower colour and soil type.

But what if the results were:

        sand    clay
pink    30      20
blue    22      28

How would we decide whether there was "enough" difference, to decide that flower colour and soil type were linked? The chi-squared test for association gives us a way of deciding what constitutes "enough" difference.

The test is used to compare observed frequencies (what is produced from the investigation) with expected frequencies (what you'd expect from the null hypothesis).

Worked Example

In an investigation into the habitats preferred by light and dark-shelled snails, the following results were obtained:

                      Light    Dark
Limestone pavement    121      85
Limestone woodland    74       111

Step 1: Write down the hypotheses
H0: Shell colour is independent of habitat preference
H1: Shell colour is not independent of habitat preference

Step 2: Work out the row, column and overall totals for the original data

                      Light               Dark                 Total (row totals)
Limestone pavement    121                 85                   85 + 121 = 206
Limestone woodland    74                  111                  74 + 111 = 185
Total (column totals) 121 + 74 = 195      85 + 111 = 196       206 + 185 (or 195 + 196) = 391 (overall total)

Step 3: Calculate the expected frequencies for each category using the formula

$E = \dfrac{\text{row total} \times \text{column total}}{\text{overall total}}$

                      Light                       Dark
Limestone pavement    206 × 195/391 = 102.7       206 × 196/391 = 103.3
Limestone woodland    185 × 195/391 = 92.3        185 × 196/391 = 92.7

Step 4: For each of your categories, work out

$\dfrac{(O - E)^2}{E}$

(O = observed values, from the experiment; E = expected values, from step 3)

e.g. for Light, Pavement: $\dfrac{(121 - 102.7)^2}{102.7} = 3.26$

            Light    Dark
Pavement    3.26     3.24
Woodland    3.63     3.61

Step 5: Add up all these values. This gives the chi-squared value
chi-squared value = 3.26 + 3.24 + 3.63 + 3.61 = 13.74

Step 6: Work out the degrees of freedom using the formula: (rows - 1)(columns - 1)
Degrees of freedom = (2 - 1) × (2 - 1) = 1

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom.
Tables value for 1 df and 5% significance is 3.84

df    .10     .05     .025     .01      .005
1     2.71    3.84    5.02     6.63     7.88
2     4.61    5.99    7.38     9.21     10.60
3     6.25    7.81    9.35     11.34    12.84
4     7.78    9.49    11.14    13.28    14.86

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is larger than the tables value, so we reject the null hypothesis.

Step 9: Write down your conclusion.
At the 5% level of significance, we can conclude that shell colour and habitat preference are not independent.

As with any other statistical test, the results are only as reliable as the original data.
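The steps of the snail worked example can be sketched as a short Python script. This is an illustration only: the function name `chi_squared_association` is hypothetical, the counts are those given in the example, and 3.84 is the tables value quoted for 1 df at the 5% level. The script keeps the expected frequencies unrounded, so it gives about 13.69 rather than the 13.74 obtained by hand from values rounded to one decimal place; the decision is the same.

```python
def chi_squared_association(observed):
    """Chi-squared test for association on a table of observed
    frequencies (a list of rows). Returns (chi2, df, expected)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)

    chi2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, o in enumerate(row):
            # Step 3: expected = row total x column total / overall total
            e = row_totals[i] * col_totals[j] / overall
            expected_row.append(e)
            # Steps 4 and 5: add up (O - E)^2 / E over all categories
            chi2 += (o - e) ** 2 / e
        expected.append(expected_row)

    # Step 6: degrees of freedom = (rows - 1)(columns - 1)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected


# Rows: limestone pavement, limestone woodland; columns: light, dark
snails = [[121, 85], [74, 111]]
chi2, df, expected = chi_squared_association(snails)

# The test is only valid if all expected frequencies are at least 5
valid = all(e >= 5 for row in expected for e in row)

critical_value = 3.84  # tables value for 1 df at the 5% level
reject_null = chi2 > critical_value

print(round(chi2, 2), df, valid, reject_null)
```

The same function works for larger tables (for example three habitats and two shell colours), as long as the critical value is looked up for the matching degrees of freedom.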
In evaluating this sort of investigation, you should consider:
• whether there are any other factors, other than those measured, which might affect the numbers of each type of snail
• whether you sampled enough areas in each category
• whether your sampling method could have introduced any kind of bias
• whether the areas sampled are "typical" of limestone woodland and pavement.

Acknowledgements: This Geo Factsheet was researched and written by Cath Brown. Curriculum Press. Unit 305B, The Big Peg, 120 Vyse Street, Birmingham, B18 6NF Biopress Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136

Page 85

Bio Factsheet, January 2001, Number 79

The Chi-Squared Test for Goodness of Fit

The chi-squared (χ²) test is widely used within project work. This Factsheet will tell you when and how to use it. Questions using the chi-squared test may occur on exam papers, but you will not have to remember the formula - it will be given to you.

Hypotheses

The purpose of all statistical tests is to choose between two hypotheses - for example: "the leaves are smaller at the top of the tree than at the bottom" and "the leaves are not smaller at the top of the tree than at the bottom".

The null hypothesis (H0) always has to be the "boring case" - that there's no difference between things. In the above example, it would be "the leaves are not smaller at the top of the tree" - or equivalently, "the leaves are the same size at the top and bottom of the tree". It can never be "the leaves are bigger at the top of the tree".

The other hypothesis is called the alternative hypothesis (H1). In our leaves example, it would be "the leaves are smaller at the top of the tree".

When we carry out a test, we always start out assuming that the null hypothesis is true, and we only change our minds if we have enough evidence - it's like a trial, when you are assumed to be innocent unless there's enough evidence to show that you aren't!

Table 1 shows some possible investigations using chi-squared and the corresponding null hypotheses.

Table 1. Investigations using chi-squared

Investigation: How the concentration of nitrate in water affects seed germination.
Null Hypothesis (H0): The concentration of nitrate in water has no effect on seed germination
What to Measure: The number of seeds germinating within a set time period when watered with solutions containing different concentrations of nitrate.

Investigation: How a particular type of pollution affects a particular organism
Null Hypothesis (H0): The level of pollution has no effect on the numbers of the organism
What to Measure: The numbers of the organism obtained from a set area in two sites which are similar, except for one being polluted and one not polluted

Investigation: Are the predictions of genetics accurate?
Null Hypothesis (H0): The numbers of organisms in each category are in accordance with the predictions of genetics
What to Measure: The numbers of organisms in each category

Exam Hint: - Marks are only awarded for an appropriate use of statistics. Decide exactly what your hypotheses are and what test you are going to use before collecting your data.

What is chi-squared?

Chi-squared goodness of fit is used to test whether the actual results of an experiment fit in with what we'd expect if the null hypothesis were true.

To see how this works, imagine testing a normal coin to see whether heads and tails were equally likely. Our null hypothesis is that they are equally likely - we have no reason to believe otherwise before we do an experiment, and we always choose "the boring case" for the null hypothesis. If we tossed our coin 600 times, we'd expect to get about half of each - around 300 heads and 300 tails.

If we then actually tossed the coin 600 times and got over 500 heads, we'd feel that this was a long way off from our predictions - so we'd probably decide that the coin was weighted. However, if we got 305 heads and 295 tails, we'd probably feel this was close enough, and decide the coin was OK. The chi-squared test lets us decide on an accurate basis what counts as "close enough".

Obviously, we can never be absolutely certain that our decision is correct - we could get 500 heads by chance even if the coin wasn't weighted. We can decide how far off the results have to be by carrying out the test at different significance levels. The smaller the significance level we use, the "further off" the results need to be for us to reject the null hypothesis. Using a smaller significance level is like requiring the evidence to be more convincing. Statistical tests in biology are usually carried out at the 5% significance level.

When can chi-squared be used?

You can only use chi-squared when you have frequencies - that is, numbers of items in particular categories. You cannot use it to compare measurements or other figures directly. For example, if you were doing an experiment on seed germination, you could use chi-squared to compare the numbers of seeds germinating within a week in each of four different solutions. However, you could not use it to compare the heights of four different seedlings.

In order that the test be valid, it is also important that you should expect at least five items in each category. Sometimes it may be necessary to combine categories in order to achieve this - for example, if you were researching public attitudes by using a questionnaire, you might need to combine the responses "not very concerned" and "not at all concerned".

Exam Hint: - Do not try to be too ambitious in your investigation. Testing one simple, easily measurable hypothesis involving only one variable successfully will gain more marks than an attempt at investigating a situation affected by many variables.

Worked Example

A student decides to investigate whether pollution levels affect the incidence of asthma. They obtain a sample of 50 sixth formers from their own school, which is situated in a large, polluted city and 50 sixth formers from another school, which is situated in a small, relatively unpolluted town. Each sixth former is asked whether or not s/he suffers from asthma. The student obtains the following results:

City: 16 with asthma
Town: 8 with asthma

Step 1: Write down the hypotheses
Hypotheses are:
H0 (null hypothesis): Pollution has no effect on incidence of asthma
H1 (alternative hypothesis): Pollution has some effect on the incidence of asthma

Step 2: Work out the expected frequencies
We do this by adding up all the people with asthma, and dividing by the number of different categories we're looking at - which is two (city and town).
So the expected frequencies are (16 + 8) ÷ 2 = 12 in each area

Exam Hint: - Don't worry if your expected frequencies are not whole numbers. They don't have to be! Do not round them to the nearest whole number - this will make your test less accurate.

Step 3: Put your observed frequencies (from the actual experiment) and the expected frequencies (from step 2) in a table.

                City    Town
Observed (O)    16      8
Expected (E)    12      12

Step 4: For each of your categories, work out

$\dfrac{(O - E)^2}{E}$

For the city: $\dfrac{(16 - 12)^2}{12} = 1.33$
For the town: $\dfrac{(8 - 12)^2}{12} = 1.33$

Step 5: Add up all these values. This gives the chi-squared value
chi-squared value = 1.33 + 1.33 = 2.66

Step 6: Work out the degrees of freedom. This is one less than the number of categories
Degrees of freedom = 2 − 1 = 1

Exam Hint: - Don't worry about what degrees of freedom means! Unless you want to study Statistics as a subject, you don't need to know!

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom. In the table 5% is shown as 0.05.
We look for the 5% level for one degree of freedom. This is 3.84 (Table 2)

Table 2. Chi-squared tables

df    0.10    0.05    0.025    0.01     0.005
1     2.71    3.84    5.02     6.63     7.88
2     4.61    5.99    7.38     9.21     10.60
3     6.25    7.81    9.35     11.34    12.84
4     7.78    9.49    11.14    13.28    14.86

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is smaller than the value from the tables, so we accept the null hypothesis - there is no significant difference in the amount of asthma between the city and the town.

Points to note

1. This investigation needs care in sampling technique!
• The sixth formers need to live in the area, not just go to school in it
• They may not have lived there for long
• School sixth formers may not be representative of the population as a whole
• How does the student know what the pollution levels are - are they just assuming it?
• Pollution levels are not the same throughout a city or town.

2. This test is not telling us that the null hypothesis is definitely correct - it is telling us that we haven't got enough evidence to reject it.

3. To improve the chance of getting a significant result - in other words, rejecting the null hypothesis - a larger sample usually helps! For example, if the student had taken a sample of 100 from each school and found the city and town had 32 and 16 respectively with asthma, s/he would have been able to reject the null hypothesis (check this calculation!)

Page 86

Worked Example

A student decides to investigate the results of crossing plants with red and yellow flowers. The student knows that the allele for red flowers is dominant. S/he carries out the cross, and obtains 18 plants, all of which have red flowers.

a) Give the genotype of the 18 red-flowered plants produced. (1 mark)
b) Explain what, if anything, can be deduced about the genotype of the parent red-flowered plant. (2 marks)

The student then crosses two of the offspring red-flowered plants and obtains 17 red-flowered plants and 3 yellow-flowered plants.

c) i) Find the ratio of red-flowered to yellow-flowered plants that would be expected. (2 marks)
ii) Carry out a chi-squared test at the 5% significance level to determine whether the results are in accordance with your predictions. (10 marks)

Answer and Mark Scheme

a) It must be Rr (since the yellow parent would be rr, and any rr offspring would be yellow);

b) Probably RR; since no yellow-flowered offspring are produced, (but this is not certain);

c) i) Rr crossed with Rr produces RR, Rr, rR and rr in equal proportion; Since the first three all are red-flowered, we would expect red-flowered:yellow-flowered to be 3:1;

ii) Step 1: Write down the hypotheses
Hypotheses are:
H0: Results obtained are not significantly different from 3:1 ratio
H1: Results obtained are significantly different from 3:1 ratio;

Step 2: Work out the expected frequencies
We do this using

$\text{Expected no. of individuals in a category} = \text{Total no. of individuals} \times \dfrac{\text{Ratio number for that category}}{\text{Total of all the numbers in the ratio}}$

So expected for red $= 20 \times \dfrac{3}{1 + 3} = 15$;
Expected for yellow $= 20 \times \dfrac{1}{1 + 3} = 5$;

Exam Hint: - Check that your expected frequencies add up to the total number of individuals - 15 + 5 = 20

Step 3: Put your observed frequencies (from the actual experiment) and the expected frequencies (from step 2) in a table.

      Red    Yellow
O     17     3
E     15     5

Step 4: For each of your categories, work out $\dfrac{(O - E)^2}{E}$
Red: $\dfrac{(17 - 15)^2}{15} = 0.2667$;
Yellow: $\dfrac{(3 - 5)^2}{5} = 0.8$;

Step 5: Add up all these values. This gives the chi-squared value
chi-squared value = 0.2667 + 0.8 = 1.0667;

Step 6: Work out the degrees of freedom. This is one less than the number of categories
Degrees of freedom = 2 − 1 = 1;

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom.
We look for the 5% level for one degree of freedom; This is 3.84 (Table 2 overleaf)

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is smaller than the value from the tables, so we accept the null hypothesis - the results are not significantly different from the 3:1 ratio;

Total: 15 marks

Acknowledgments: This Factsheet was researched and written by Cath Brown. Curriculum Press, Unit 305B The Big Peg, 120 Vyse Street, Birmingham B18 6NF Bio Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136
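The goodness-of-fit steps used in the two worked examples above (asthma and the 3:1 flower cross) can be sketched in Python as follows. This is a minimal illustration, not part of the Factsheet: the function name `chi_squared_goodness_of_fit` and its `ratio` parameter are made up for this sketch, and 3.84 is the tables value quoted for 1 degree of freedom at the 5% level.

```python
def chi_squared_goodness_of_fit(observed, ratio):
    """Chi-squared goodness of fit.

    observed: list of observed frequencies, one per category
    ratio:    expected ratio between the categories, e.g. [3, 1]
              for 3:1, or [1, 1] for "no difference"
    Returns (chi2, df, expected).
    """
    total = sum(observed)
    # Step 2: expected = total x ratio number / total of ratio numbers
    expected = [total * r / sum(ratio) for r in ratio]
    # Steps 4 and 5: add up (O - E)^2 / E over all categories
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 6: degrees of freedom = number of categories - 1
    df = len(observed) - 1
    return chi2, df, expected


critical_value = 3.84  # tables value for 1 df at the 5% level

# Flower cross: 17 red and 3 yellow, expected ratio 3:1
cross_chi2, cross_df, cross_expected = chi_squared_goodness_of_fit([17, 3], [3, 1])
print(cross_expected, round(cross_chi2, 4), cross_chi2 > critical_value)

# Asthma: 16 in the city and 8 in the town, expected to be equal
asthma_chi2, _, _ = chi_squared_goodness_of_fit([16, 8], [1, 1])
print(round(asthma_chi2, 2), asthma_chi2 > critical_value)

# Larger sample from "Points to note": 32 in the city and 16 in the town
larger_chi2, _, _ = chi_squared_goodness_of_fit([32, 16], [1, 1])
print(round(larger_chi2, 2), larger_chi2 > critical_value)
```

Only the larger sample gives a chi-squared value above 3.84, which matches point 3 of "Points to note": with 32 and 16 cases the null hypothesis can be rejected, while with 16 and 8 it cannot. The asthma value prints as 2.67 rather than 2.66 because the script adds the unrounded terms.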