Uploaded by Daniela Ponader

Bio IA starter pack

IBDP Internal Assessment:
INDIVIDUAL INVESTIGATION
- BIOLOGY HL/SL –
Ms. Blanka Vrgoc
TABLE OF CONTENTS (with sources):
IA lab design & write-up guidelines (Blanka Vrgoc)
Marking criteria unpacked (source unknown)
Animal experimentation policy (IB)
Data presentation in biology (IB)
Choosing a statistical test (intro2r)
Guidance on the use of criteria (IB/unknown)
Sample paper (IB)
Sample paper - marks with comments (IB)
Sample paper - annotated (IB)
IA criteria
Grading rubric DRAFT self-assessment (IB/Blanka Vrgoc)
Grading rubric DRAFT peer review (IB/Blanka Vrgoc)
Grading rubric DRAFT post-peer review self-assessment (IB/Blanka Vrgoc)
Statistical booklet
− 1 (source unknown)
− 2 (Karl Schauer)
− 3 (source unknown)
− 4 (Bio Factsheet)
IBDP G4 Internal Assessment: INDIVIDUAL INVESTIGATION DESIGN & WRITE-UP GUIDELINES
IN GENERAL:
• No student/teacher/school name, nothing identifiable
• 6-12 pages in total
− Bibliography included
− Excessive amounts of raw data can be attached as an appendix beyond the 12 page limit, but the
moderators are not required to read that (!), so make sure the 6-12 pages contains everything
necessary for understanding (up until and including bibliography)
− Page format – not prescribed, but has to be easy to read (e.g. A4, margins 1-2.5 cm,
single/double spacing)
− Page numbers are always good to have
− Table of contents is not required
− Word count is not important
• Font
− Style not prescribed, but has to be a standard easy-to-read one (e.g. Times New Roman, Calibri)
− Size not too big and not too small (e.g. font 11±1)
• Use appropriate subject-specific scientific terminology, watch out for spelling and grammar(!)
• You can use your own photos/drawings (e.g. for specific experimental set-up, to illustrate qualitative
results, etc.)
The following list should be used as a guideline of what an IA should contain:
1. TITLE
What is your IA about?
• Research question or a proper title (RQ rephrased into a statement)
• Does not need to take up an entire page!
2. INTRODUCTION
What do you find interesting that you would like to know more about, and how will you test that?
➢ Explain a problem or question to be tested by a scientific investigation.
• Explain a clear and focused reason why you chose to explore what you chose (for your IA) – how
you got to the idea and how you developed/adapted a procedure, including observations,
citations, or other studies that have led you to this, why it’s relevant to you and how it’s
applicable elsewhere
• Provide appropriate and relevant scientific background information on the topic
• RESEARCH QUESTION
− Has to be introduced/restated (depending on the title) as a clear and focused question!
− Should include a brief mention of your independent and dependent variables and, if applicable,
the name of the organism studied (both scientific and common)
− Has to be focused, researchable, answerable, arguable, non-biased (avoid yes/no questions)
• HYPOTHESIS
➢ Formulate and explain a testable hypothesis using correct scientific reasoning.
− Not obligatory, but advisable – should be phrased as a predicted answer to your research
question, based on your current knowledge (this is not simply what you expect to happen,
but what you think will happen according to what you already know)
− The RQ and the hypothesis provide a good anchor to refer to after you’ve conducted
your experiment. You seek to test (not prove!) your hypothesis to answer your RQ, so
it’s good to explicitly explain your results in the light of the two
Ms. Blanka Vrgoc (Adapted from MYP & DP/IA Science Criteria)
− This is an example of a lab-based hypothesis, or other experiments where you are setting
up the variables:
If/When ______________ (INDEPENDENT VARIABLE), then ______________ (DEPENDENT VARIABLE),
because ______________ (use your knowledge to explain your prediction).
− The variables will vary for ecology or other observational experiments where you are
researching what is already happening
• VARIABLES
➢ List variables and explain how to manipulate them.
➢ Explain how sufficient relevant data will be collected.
− A table or a bulleted list is preferable
− Bear in mind that variables as such are not necessarily applicable if you are researching
correlation (e.g. in ecology or genetics)
− INDEPENDENT VARIABLE (min. 5 increments) – what you set up to test the effects of
− DEPENDENT VARIABLE (min. 5 repetitions/replicates, but the more the better) – specify
what data you will collect and how
▪ what you will use to measure the consequences of the independent variable
(what will change that you can measure) – include the instrument error (half of
the smallest unit it can measure)
▪ has to be quantitative = measurable (qualitative is immeasurable, but good as a
visible/tangible observation)
− CONTROLLED VARIABLES – everything that (you make sure) will be the same in all
experiments (specify how you will control them)
− CONFOUNDING VARIABLES – variables beyond your direct control, but that would have
the same influence on all experimental conditions (specify the influence if possible)
• METHODOLOGY
➢ Design a logical, complete and safe method using appropriate materials and equipment.
− MATERIALS – list all the major pieces of apparatus, equipment and substances used
− SET UP – labeled drawing/diagram/photo that shows the apparatus you used (not
necessary, but useful for certain specific experiment-based investigations)
− SAFETY, ETHICAL, ENVIRONMENTAL CONSIDERATIONS – outline any safety concerns &
how they will be addressed (Animal Experimentation Policy!), where your materials are
coming from, what will happen to them afterwards, how wastes will be disposed of, etc.
(if something is not applicable, state so explicitly)
▪ if you are doing any experiment involving humans, you need to obtain informed
consent forms from them, and this needs to be explicitly addressed (an example
of such a form is something you could add as an appendix beyond the 12th page)
− PROCEDURE – precise list of steps used (passive voice) – the reader should be able to
recreate your exact experiment and get the same result (a paragraph is acceptable, but
writing the “cookbook” in a numbered list makes it much clearer)
3. DATA COLLECTION & ANALYSIS
TRANSFORMING & PRESENTING DATA – data processing and/or statistical analyses, and any visuals (graphs,
etc.) that make the data easier to understand and make sense of
➢ Correctly collect, organize, transform and present data in numerical and/or visual forms.
• RAW DATA COLLECTION
− Any qualitative observations made during the experiment should be mentioned
− Quantitative data includes raw data collected by measuring the dependent variable (and
anything relevant about the controlled variables, if applicable) – usually best displayed in
a table
− Raw data displayed in a graphic form is still just raw data! (so only use it if it facilitates
understanding, avoid repetition for the sake of adding a graph)
• TRANSFORMING & PROCESSING DATA
− Overview of processing should be very short and simply indicate what you did to process
the data to facilitate interpretation – state what statistical test was performed and why
− No need to include sample calculations (but not forbidden if it’s really important)
− STATISTICS (specific tests may or may not be applicable, but some of these or other
appropriate ones have to be present):
▪ Descriptive: mean, median, mode, % change/difference
▪ Treatment of error: range, min/max value, standard deviation, significance of
error
▪ Statistical tests: t-test/ANOVA/correlation coefficient/χ²/etc. (depends on data
collected)
▪ If you are using a statistical tool which contains its own “internal” hypothesis,
make sure not to mistake or confuse that one with your RQ hypothesis – these
are two completely different things!
• PRESENTING (PROCESSED) DATA
− Tables:
▪ Numbered in sequence and with a precisely labeled title
▪ Well designed and clear – all rows & columns must have headers, units must be
given (uncertainties can be given in the title row, underneath the table or as
footnotes, as applicable), decimal places must be consistent
▪ Try to avoid splitting a table between pages if possible, if not – make sure the
title row carries over and it’s clear that it’s the same table continued
− Graphs:
▪ Carefully chosen type to best and most clearly display the trends in data
▪ Numbered in sequence and with a precisely labeled title
▪ Axes are labeled and units given (uncertainties can be given underneath the
graph or as footnotes)
− All tables and graphs should be described/explained in the text body (preferably
introduced before they are presented), not just put in as stand-alones
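The descriptive statistics and treatment-of-error measures listed above can be sketched in a few lines of Python. This is only an illustration, not a required tool: the readings, the names `describe` and `percent_change`, and the choice of the sample standard deviation are all assumptions made for the example.

```python
import statistics

# Hypothetical replicate readings (e.g. bubbles counted per minute) for two conditions.
control = [12, 14, 13, 15, 14]
treatment = [18, 21, 19, 20, 21]

def describe(values):
    """Return the descriptive and error measures named in the list above."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

control_stats = describe(control)
treatment_stats = describe(treatment)
change = percent_change(control_stats["mean"], treatment_stats["mean"])
```

Whichever tool is used (spreadsheet, calculator or code), the write-up only needs to state which measures were calculated and why.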
4. RESULTS
Use your knowledge to interpret the data from your experiment – what do the results mean, what do they tell
you? Put the numbers into words.
➢ Accurately interpret data and explain results using correct scientific reasoning.
• Summarize (briefly describe) what the data (already presented in tables and/or graphs) you
observed during the experiment mean – “put the numbers into words”
5. CONCLUSION
Compare the RQ and/or the original hypothesis to the results obtained. Does the data help you answer your RQ?
Is the hypothesis supported by the evidence or not? How do the results provide evidence for your conclusion?
Give your results context – state whether they answer your research question or not & use the results to
explain why (not). Are your results in line with other people’s research?
➢ Evaluate the validity of a hypothesis based on the outcome of a scientific investigation.
• Explain what your data really means in a broader context
• If you wrote a hypothesis, restate it and discuss whether your data supports or refutes it, and justify
your conclusion through the data obtained
• Restate your RQ and discuss whether your data does/doesn’t answer it, and justify your conclusion
through the data obtained
• Put your results into accepted scientific context – use other published scientific papers on the
same topic and compare relevant results to yours
6. EVALUATION
What worked well during the experiment? Were there any mistakes in the lab set up or while performing the
lab? Was there anything else? Were the results clear enough, or did something affect them? How?
Is there something that could be done better or to get better results? Could there be another experimental
approach for testing the same hypothesis?
➢ Evaluate the validity of the method based on the outcome of a scientific investigation.
➢ Explain improvements or extensions to the method that would benefit the scientific investigation.
• State the strengths and limitations/weaknesses of your lab design, and discuss why
• Mention any potential preliminary trials conducted and/or modifications to your experiment
• Offer realistic suggestions that would address the limitations/weaknesses you identified
• Provide ideas on what else can be done to further expand the understanding of the matter you
researched
7. BIBLIOGRAPHY
List all textbooks, scientific papers, etc. you used (quoted) during any of the steps (APA citation format
suggested, but any is ok as long as it is consistent).
➢ List all the sources you used in your research.
• Best if listed at the very end
• Be consistent in the formatting style (APA recommended for science, MLA is also
acceptable, as is any other you choose – so long as it is the same throughout the write-up)
MARKING CRITERIA UNPACKED
Levels of performance are described using multiple indicators per level. In many cases the indicators occur together
in a specific level, but not always. Also, not all indicators are always present. This means that a candidate can
demonstrate performances that fit into different levels. To accommodate this, the IB assessment models use
markbands and advise examiners and teachers to use a best-fit approach in deciding the appropriate mark for a
particular criterion. The indicators per level per criterion can be found in Table 1 on the last page.
Additional guidance (what is marked) per criterion:
Personal Engagement (2):
Individuality, originality, creativity in experiment design, personal interest, independent thinking & research
• A statement of purpose
• The relationship with the real world
• The originality of the design of the method (choice of materials and methods)
• Evidence of trial runs
• The difficulty of collecting data (evidence of tenacity)
• The quality of the observations made
• The care in the selection of techniques to process the data
• The reflections on the quality of the data
• The type of material referred to in the background or in the discussion of the results
• The depth of understanding of the limitations in the investigation
• The reflections on the improvement and extension of the investigation.
Exploration (6):
Workable method, focus on the problem, sufficient data, health/safety/ethical/environmental considerations
• The protocol for collecting the data
• The range and intervals of the independent variable (where applicable)
• The selection of measuring instruments (where relevant)
• Techniques to ensure adequate control (fair testing)
• The use of control experiments
• The quantity of data collected, given the nature of the system investigated
• The type of data collected
• Provision for qualitative observations
Safety/Ethical/Environmental Issues:
• Evidence of a risk assessment, even if the investigation is considered “safe”.
• An appreciation of the safe handling of chemicals or equipment (e.g. the use of protective clothing and eye
protection for labs, appropriate gear for a given sport)
• An appreciation of the safety issues particular to some sports
• Consideration of basic hygiene
• The application of the IB animal experimentation policy
• A reasonable use of materials
• The use of consent & PAR-Q forms, and a consideration of the welfare of the volunteers
• The correct disposal of waste
• Attempts to minimize the impact of the investigation on the environment
Ms. Blanka Vrgoc (Adapted from MYP & DP/IA Science Criteria) – source unknown
Analysis (6):
Recording & processing, treatment of data, analysis of processed data
• Carefully selected appropriate statistical tool(s) to identify trends in the data
• Calculations carried out with precision
Evaluation (6):
Conclusion, identification of strengths & weaknesses, improvements & extensions, variability & significance of data
• A conclusion that is supported by the data
• A conclusion that refers back to the research question (and hypothesis, where applicable)
• An explanation based upon a scientific context
• A discussion of the strengths – this might be quite general or implicit or it might refer to specific parts
that worked well or data that was consistent
• Discussion of the reliability of the data
• Identified weaknesses in the method and materials
• The evaluation of the relative impact of a weakness on the conclusion
• Sensible, realistic improvements (with details)
• Realistic extensions that clearly follow on from the investigation
Communication (4):
Subject-specific vocabulary, correct format, graphs & tables quality and labeling, consistency, units & recording of
errors, logical and easy to read, consistent referencing
• The use of whole pages for titles is not necessary
• Table of contents is not necessary
• No need for blank data tables presented at the end of the method section
• There is often no need for a raw data table as well as a table with processed data (especially if the raw
data is very extensive)
• Do not relegate raw data to the appendix without a good reason; this upsets the flow of the report
• Clear and purposeful data table & graph headers
• Avoid splitting a table over two pages, or having a title on one page and the table (or graph) on the next page
• Avoid drawing multiple graphs when they could have been combined
• Make sure the sizes of graphs are appropriate and easily readable
• Skipping the bibliography, footnotes, endnotes or in-text citations will lead to suspected plagiarism
• Avoid references with an incomplete format (a URL alone is not enough)
• Adhere to proper scientific vocabulary
Guidelines for the use of animals in IB World Schools
Why have guidelines for use of animals in the classroom?
As respect for animals is a fundamental stepping stone in the development of respect for fellow human beings, the
IB animal guidelines seek to set out the parameters for the acceptable inclusion of animals in an IB World School.
What do the guidelines apply to?
These guidelines apply to the treatment of all animals in IB World Schools, to all students at all levels
including PYP, MYP, DP and IBCC, whether assessed or non-assessed, for extended essays, the group 4
project and the MYP project. The Guidelines cover any work, be it in classrooms or school laboratories, or in the
general environment, that is, anywhere where IB students may be working. The Guidelines apply to:
1. Keeping animals in schools
2. Animal Experimentation
3. The use of human subjects in investigations
The Guidelines
Keeping live animals in the classroom
Caring for classroom pets can provide a variety of authentic learning contexts for students at almost every level.
It presents opportunities for students to develop compassion and empathy towards other living things and take
action as a result of this learning. Ultimately the decision to care for a live animal lies with the classroom teacher
and time should be taken to adequately research the animal and determine a suitable diet, housing, exercise
and socialization for the animal as well as how its care fits into the curriculum. The following should be carefully
considered before committing to the care of a classroom pet:
• Student sensitivity or allergies to particular species, their food or bedding materials
• Type of animal (domestic rather than wild, not venomous or vicious, diurnal rather than nocturnal, etc.)
• Arrangements for housing the animal safely, comfortably, cleanly and in a manner that is not disruptive to
the classroom environment
• Arrangements for appropriate care of the animals over weekends and holidays
• Long-term care of the animal in cases where a future student is allergic or the animal can no longer live in
the classroom
Additionally, essential agreements should be established regarding when and how the animal is to interact with
students. These should ensure the health and safety of both students and the animal (e.g. students wash their
hands before and after handling).
The nature of the guidelines
IB animal experimentation guidelines may be more stringent than some local or national standards for
experimentation in schools. Our standards for work in schools should also be more stringent than those of
university and research and development committees as we are not carrying out essential, groundbreaking
research. Practical work in schools has other purposes such as reinforcing concepts and teaching practical skills
and techniques. Even in a practically based extended essay the work will not be fundamental, ground-breaking
research.
Live animals in experimentation
Any planned and actual experimentation involving live animals must be subject to approval by the teacher
following a discussion between teacher and student(s) based on the IB guidelines. This discussion should consider
the 3Rs principle, and the decision should be justified. The principles are:
• Replacement
• Refinement
• Reduction
Any investigation involving animals should initially consider the replacement of animals with cells or tissues,
plants or computer simulations. If the animal is essential to the investigation, refinements to the investigation to
alleviate any distress to the animal and a reduction in the numbers of animals involved should be made.
Experiments involving animals must be based on observing and measuring aspects of natural animal behaviour.
Any experimentation should not result in any cruelty to any animal, vertebrate or invertebrate. Therefore
experiments that administer drugs or medicines or manipulate the environment or diet beyond that which can be
regarded as humane are unacceptable in IB schools.
Animal dissection
There is no requirement in the PYP, MYP or in the DP group 4 sciences for students to witness or carry out a
dissection of any animal, vertebrate or invertebrate. If teachers believe that it is an important educational
experience and wish to include dissections in their scheme of work they must apply the following guidelines.
The IB does not support animal dissection or the use of animal body parts in the PYP.
• Discuss reasons for dissections of whole animals with the students.
• Allow any student who wishes to opt out of the dissection to do so.
• Seek to reduce the number of dissections.
• Seek to replace animal dissection with computer simulations and/or use animal tissue, for example,
hearts and lungs obtained from butchers, abattoirs or laboratory suppliers.
• Dissect animals obtained from an ethical source only, for example, no wild animals, animals killed on
the road or endangered animals.
Experiments involving human subjects
Any experimentation involving human subjects must be with their direct, legally obtained written permission and
must follow the above guidelines. In addition, the investigation must not use human subjects under the age of 16
without the written consent of the parents or guardians.
• Subjects must provide written consent
• The results of the investigation must be anonymous
• Subjects must participate of their own free will
• Subjects have the right to withdraw from the investigation at any time.
Investigations involving any body fluids must not be performed due to the risk of the transmission of blood-borne
pathogens. An exception would be an investigator using their own saliva or sweat.
The use of secondary data
Secondary data acquired as a result of research that would not be in line with the above policy may be used
under certain circumstances:
• Data acquired by professional researchers. In this case the data would be from research which is written
up in academic journals and qualifies as groundbreaking. Such research would have been presented to
research committees for approval and be licensed.
• Research which was considered ethical at the time the research was conducted. Our view of animals and
their welfare has moved on considerably in recent years. Much research conducted in a different culture
would not be granted permission today even though, at the time, it was considered acceptable. Data from
such sources is acceptable.
Some secondary data exists that was considered unethical even within the cultural and historical context of the
day. Such data is not acceptable under any circumstances.
What happens if the guidelines are not followed?
Internal assessment moderators or extended essay examiners who see evidence that the guidelines are not
being followed at the school, in the sample work sent for moderation or in extended essays are required to
complete a problem report form (PRF) to be submitted to IB Cardiff.
© International Baccalaureate Organization 2015
Data presentation in biology
These guidelines are for HL and SL students for writing up their investigations whether they are assessed or not. The
outlines are not prescriptive, but are there to help students produce clear and easy to interpret presentations of
their work.
Units
The international system of units should be used wherever possible, although the main consideration is that units
should be fit for purpose. It is, for example, preferable to use minutes rather than seconds in some instances, such as
when assessing the effect of exercise on heart rate or the rate of transpiration, or cm³ rather than m³ for depicting
the volume of carbon dioxide produced by respiring yeast cells. Non-metric units such as inches or cups should not
be used.
Tables
Tables are designed to lay out the data ready for analysis. The table should have an explanatory title. “Table of
results” is not an explanatory title, whereas “Table to show the time taken to produce 1 cm³ of oxygen at different
concentrations of carbon dioxide by Elodea” describes the nature of the data collected. Other points to note are:
• units should only appear in cell headings rather than in the body of the table
• error for the instrument used or the accuracy of the reading should appear in the cell heading if relevant
• the independent variable should be in the first column
• subsequent columns should show the results for the dependent variable
• decimal places should be consistent throughout a column
• mean values should not have more decimal places than the raw data used to produce them.
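The last two table rules (consistent decimal places, and means no more precise than the raw data) can be illustrated with a short Python sketch. The readings and the helper names `decimal_places` and `mean_for_table` are hypothetical; readings are kept as text exactly as recorded so that their decimal places can be counted.

```python
def decimal_places(text):
    """Count the decimal places in a reading recorded as text, e.g. '12.5' -> 1."""
    return len(text.split(".")[1]) if "." in text else 0

def mean_for_table(raw_readings):
    """Mean formatted to the same number of decimal places as the raw readings."""
    places = max(decimal_places(r) for r in raw_readings)
    mean = sum(float(r) for r in raw_readings) / len(raw_readings)
    return f"{mean:.{places}f}"

# Hypothetical raw readings, all recorded to one decimal place.
readings = ["12.5", "13.1", "12.9"]
table_mean = mean_for_table(readings)  # the unrounded mean is 12.8333...
```

Here the mean is reported to one decimal place, matching the raw data, rather than as the full calculator output.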
The methods used to process the data should be easy to follow. The processed data may be included in the same
table as the raw data; there is no need to separate them.
Graphs
Graphs should be clear, easy to read and interpret with an explanatory title. If IT software is used, the graph should
have clearly identifiable data points and demarcated and labelled axes of a suitable scale.
Adjacent data points should be joined by a straight line and the line should start with the first data point and end
with the last one, as there should be no extrapolation beyond these points. Lines of best fit are only useful if there is
good reason to believe that intermediate points fall on the line between two data points. The usual reason for this is
the collection of a large amount of data, which is often not possible given the time constraints of investigations at
this level. Likewise, extrapolation of the line will only make sense if there is a large amount of data and a line of best
fit is predicted or there is reference made to the literature values. Students should exercise caution when making
assumptions.
Finally, the type of graph chosen should be appropriate to the nature of the data collected.
Error
There are sources of error at a number of stages of any investigation. The chosen method should try to address as
many as possible by considering the control of variables, but despite this, many will remain. Students should not be
discouraged by this, because experimental results are only samples (see NOS section 3, “The objectivity of science” in the
Biology guide). Rather, they should take these errors into consideration when analysing the data and drawing conclusions. A
thorough evaluation of the sources of uncertainty and error will also help to gain perspective on the investigation in
general and to suggest potential improvements and extensions.
Random variation and normal variation
In biological investigations, errors can be caused by changes in the material used or by changes in the conditions
under which the experiment is carried out. Biological materials are particularly variable. For example, the water
potential of potato tissue may be calculated by soaking pieces of tissue in a range of concentrations of sucrose
solutions. However, the pieces of tissue will vary in their water potential, especially if they have been taken from
different potatoes. Pieces of tissue taken from the same potato will also show variations in water potential, but they
will probably show a normal variation that is less than that from samples taken from different potatoes. Random
errors can, therefore, be kept to a minimum by careful selection of material and by careful control of variables, for
example by using a water bath to reduce the random fluctuations in ambient temperature.
Human errors
Mistakes are not an acceptable source of error if they could easily have been avoided with due care and
attention. Data loggers can be used if a large number of measurements need to be made, to avoid errors arising due
to loss of concentration. Careful planning can help reduce this risk.
The act of measuring
When a measurement is taken, this can affect the environment of the experiment. For example, when a cold
thermometer is put into a test tube with only a small volume of warm water in it, the water will be cooled by the
presence of the thermometer so it would be sensible to scale up the volume or have the thermometer in the
solution from the start. If the behaviour of animals is being recorded, the presence of the experimenter may
influence the animals’ behaviour. Although there are ways to reduce the impact of observer influences, it may have
to be something that is taken into account later.
Systematic errors
Systematic errors can be reduced if equipment is regularly checked or calibrated to ensure that it is functioning
correctly. For example, a thermometer should be placed in an electronic water bath to check that the thermostat of
the water bath is correctly adjusted. A blank should be used to calibrate a colorimeter to compensate for the drift of
the instrument.
Degrees of precision and uncertainty in data
Students must choose an appropriate instrument for measuring such things as length, volume, pH and light
intensity. This does not mean that every piece of equipment needs to be justified, and it can be appreciated that, in
a normal science laboratory, the most appropriate instrument may not be available.
For the degrees of precision, the simplest rule is that the degree of precision is plus or minus (±) the smallest division
on the instrument (the least count). This is true for rulers and instruments with digital displays.
The instrument limit of error is usually no greater than the least count and is often a fraction of the least count
value. For example, a burette or a mercury thermometer is often read to half of the least count division. This would
mean that a burette value of 34.1 cm³ becomes 34.10 cm³ (±0.05 cm³). Note that the volume value is now cited to
one extra decimal place so as to be consistent with the uncertainty.
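The burette example can be reproduced with a small Python sketch. The function name `reading_with_uncertainty` and its `fraction` parameter are assumptions made for illustration: `fraction="0.5"` gives the half-least-count rule, while `fraction="1"` gives the simpler "plus or minus the smallest division" rule.

```python
from decimal import Decimal

def reading_with_uncertainty(reading, least_count, fraction="0.5"):
    """Format a reading with an uncertainty of `fraction` x least count.

    Values are passed as strings so that Decimal keeps the recorded precision.
    """
    uncertainty = Decimal(least_count) * Decimal(fraction)
    # Quote the reading to the same number of decimal places as the uncertainty.
    value = Decimal(reading).quantize(uncertainty)
    return f"{value} ± {uncertainty}"

# Burette with 0.1 cm³ divisions, read to half a division.
burette = reading_with_uncertainty("34.1", "0.1")

# Digital display with 0.1 divisions, using the full least count.
digital = reading_with_uncertainty("12.3", "0.1", fraction="1")
```

The burette reading comes out as `34.10 ± 0.05`, with the extra decimal place described above; the unit (cm³) still has to be stated in the column heading.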
The estimated uncertainty takes into account the concepts of least count and instrument limit of error, but also,
where relevant, higher levels of uncertainty as indicated by an instrument manufacturer (this information is usually
obtainable online), or qualitative considerations such as parallax problems in reading a thermometer scale, reaction time in
starting and stopping a timer, or random fluctuation in an electronic balance read-out. Students should do their best
to quantify these observations into the estimated uncertainty.
Other protocols exist and no specific protocol is preferred as long as it is clear that recording of uncertainties has
been undertaken and the uncertainties are of a sensible and consistent magnitude.
Propagating errors
Propagating errors during data processing is not expected but it is accepted provided the basis of the experimental
error is explained.
Replicates and samples
Biological systems, because of their complexity and normal variability, require replicate observations and multiple
samples of material. As a rule of thumb, the lower limit is five measurements within the independent variable, with
three runs for each. This will produce five data points for analysis. So in an investigation into the effect of
temperature on the rate of reaction of an enzyme, temperature is the independent variable (IV) and the rate of
reaction the dependent variable (DV). The rate (DV) would need to be measured three times at each of five different temperatures
at the very least. Obviously, this will vary within the limits of the time available for an investigation. Some simple
investigations permit a large number of measurements, or a large number of runs. It is also possible to use class
data to generate sufficient replicates to permit adequate processing of the data in class, non-assessed practical
work.
The standard deviation is the spread of the data around the mean. The larger the standard deviation the wider the
spread of data is. Standard deviation is used for normally distributed data. This makes it useful for showing the
general variation/uncertainty around a point on a line graph, but it is less helpful for identifying potential anomalies.
Error bars that plot the highest and the lowest value for a test, joined by a vertical line through the mean (the data
point plotted on the graph), will allow the variation/uncertainty for each data set to be assessed. If
the error bars are particularly large, then it may show that the readings taken are unreliable (although reference to
the scale might be needed to determine what large actually is). If the error bars overlap with the error bar of a
previous or subsequent point, then it would show that the spread of data is too wide to allow for effective
discrimination. If trend lines are possible, then adding the coefficient of determination (R²) can be helpful as an
indication of how well the trend line fits the data.
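The quantities discussed above (mean, standard deviation, highest-to-lowest error bars and R²) can be computed with a short script. The sketch below uses invented data for five temperatures with three runs each; the helper names `summarise`, `ranges_overlap` and `r_squared` are assumptions made for this example, and the R² is for a straight-line fit only.

```python
from statistics import mean, stdev

# Hypothetical data (not from any real investigation): reaction rate (DV)
# measured in three runs at each of five temperatures (IV, in °C).
runs = {
    20: [1.9, 2.1, 2.0],
    25: [2.4, 2.6, 2.5],
    30: [3.1, 2.9, 3.0],
    35: [3.4, 3.6, 3.5],
    40: [4.1, 3.9, 4.0],
}

def summarise(values):
    """Mean, sample standard deviation and lowest/highest value for one data point."""
    return {"mean": mean(values), "sd": stdev(values),
            "low": min(values), "high": max(values)}

def ranges_overlap(a, b):
    """True if the lowest-to-highest error bars of two data points overlap."""
    return a["low"] <= b["high"] and b["low"] <= a["high"]

def r_squared(xs, ys):
    """Coefficient of determination for a least-squares straight line."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

summary = {t: summarise(v) for t, v in runs.items()}
temps = sorted(summary)
r2 = r_squared(temps, [summary[t]["mean"] for t in temps])

for t in temps:
    s = summary[t]
    print(f"{t} °C: mean={s['mean']:.2f}, SD={s['sd']:.2f}, range={s['low']}-{s['high']}")
print(f"R² of the trend line through the means: {r2:.3f}")
```

With only three runs per point, the standard deviation is a rough estimate, so plotting the lowest and highest values alongside it is a reasonable cross-check.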
Statistics
An effective presentation of the data goes a long way to assessing whether or not a trend is emerging. This is,
however, not the same as using statistics to assess the nature of such a trend and whether it is significant—in other
words, whether a trend, judged subjectively from a graph, is actually valid. Students are encouraged to use a
statistical test to assess their data, but should briefly explain their choice of test, outline the working hypothesis and
put the results of the test into the context of their investigation. For statistical tests the correct protocol should be
presented including null and alternative hypotheses, degrees of freedom, critical values and probability levels.
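As one possible illustration of this protocol, the sketch below runs a two-sample t-test by hand on invented seedling heights. The data, the variable names and the choice of an equal-variance (pooled) t-test are all assumptions for the example; the critical value is the standard table value for 8 degrees of freedom at p = 0.05 (two-tailed).

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical seedling heights in mm (illustrative numbers only).
control = [12.1, 11.8, 12.5, 12.0, 11.6]
treated = [13.0, 13.4, 12.8, 13.1, 13.2]

# Null hypothesis (H0): the two population means are equal.
# Alternative hypothesis (H1): the two population means differ (two-tailed).

def pooled_t(a, b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

t_stat, df = pooled_t(control, treated)
CRITICAL_T = 2.306  # t table: df = 8, p = 0.05, two-tailed
reject_null = abs(t_stat) > CRITICAL_T

print(f"t = {t_stat:.3f}, df = {df}, critical value = {CRITICAL_T}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

In a report, the same four items (hypotheses, degrees of freedom, critical value and probability level) would be stated in words next to the result.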
intro2r (http://www.intro2r.info/)
Updated 2018-10-01
Choosing a Statistical Test
Statistical tests are just tools. Using the correct tool for a specific job is much easier, more fun, and more useful than using the wrong tool. Learning how to select the correct tool
takes practice. Sometimes several different tools could be used and address slightly different questions or nuances to the same question. In some cases there is no
single perfect tool and we must settle for an imperfect one, understanding its limitations.
Statistics is an area of active research and development, with new tools being developed and tested. Statisticians often disagree about these new tools and how useful
they are.
There are several ways to approach thinking about what test is most appropriate.
The following questions should be useful in guiding what tests are better or worse for your question.
1. What is your (statistical) objective?
There are various ways of using statistics.
The following list moves from the easiest to the hardest, practically, computationally, and philosophically.
Description: Describes and characterizes populations and samples using descriptive statistics, graphs, and maps (e.g., opinion surveys, polls, population census).
Classification: Classifying, identifying and categorizing a sample based on its characters uses descriptive statistics, graphics, and multivariate techniques (e.g.,
species identification and description, identifying criminals and terrorists).
Comparison: Looks for differences between populations, samples, or reference values using ANOVA and similar tests (e.g., drug effectiveness).
Prediction: We can predict future measurements using regression, time series analysis or spatial interpolation (e.g., weather, election results, student success).
Explanation: Here we look for the most important drivers of variation in the data to try and understand what is going on (a lot of academic research … e.g., does
climate change affect species phenology? does conservation intervention x actually work?).
2. How many variables do you have?
Do you have one variable, two, more than two, or a lot?
3. What kind of data are they?
Discrete
Discrete data can only take particular values, with no grey area in between.
Categorical/nominal: counts or frequencies of things in two or more groups/categories, with no intrinsic ordering to these categories (e.g., eye colour, gender,
species).
Ordinal: counts or frequencies of things in two or more groups/categories, but with intrinsic ordering to these categories (e.g., level of education: elementary school,
high school, college, post-graduate).
Interval: counts or frequencies of things in two or more groups/categories, with intrinsic ordering to these categories, and where the spacing between categories is
equal (e.g., equal-sized groups of age: 0-9, 10-19, 20-29, …; income, etc.).
Integer: Numeric data can be discrete if we are counting whole things (e.g., the number of apples), even if there is potentially an infinite number of these things.
Continuous
Continuous data are not restricted to any particular set of values. Continuous data can take any value over a continuous range. These data are always essentially
numeric (e.g., height, weight, length).
Continuous data can be treated as discrete by binning, or putting each value into a specific category that encompasses a range of data. This data is then interval data
(see above).
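As a rough illustration of binning, the sketch below turns continuous values into equal-width interval categories like the age groups mentioned earlier. The function name `to_bin`, the bin width and the data are all assumptions for the example.

```python
from collections import Counter

def to_bin(value: float, width: int = 10) -> str:
    """Label the equal-width bin a non-negative value falls into, e.g. 23.7 -> '20-29'."""
    low = int(value // width) * width
    return f"{low}-{low + width - 1}"

# Hypothetical ages in years (illustrative values only).
ages = [3.5, 9.9, 12.0, 18.2, 23.7, 27.1, 29.0]
bin_counts = Counter(to_bin(a) for a in ages)

for label, count in sorted(bin_counts.items(), key=lambda item: int(item[0].split("-")[0])):
    print(label, count)
```

The resulting counts per bin are interval data, so tests designed for frequencies can be applied to them.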
4. Is there a distinction between dependent and independent variables?
Is there some a priori reason why you think that one variable has a direct effect on the other? If you are trying to predict or explain something, you are assuming some
causality.
5. Are the samples autocorrelated?
Observations may be correlated with each other in some way (i.e., autocorrelated).
Time: if you take repeated measurements of the same thing, e.g., a person's height every year, the temperature every hour, the DBH of a tree every month.
Space: if you measure the distribution of nutrients or minerals in soil, household income throughout a town.
In these two cases, temporal or spatial autocorrelation has little-to-no effect on the coefficient estimates in your statistical models, but it will affect the variance (standard
deviation, standard errors) that are calculated, and therefore also any p-values and statistical significance.
Sequence: if you use the same equipment to take measurements and that equipment slowly drifts out of calibration, or if you take light measurements at the forest
floor as the sun comes up.
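One simple way to check for temporal autocorrelation is the lag-1 autocorrelation coefficient, which compares each measurement with the one taken immediately after it. The sketch below is a minimal version using invented hourly readings; the function name and the data are assumptions for the example.

```python
from statistics import mean

def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: values near 0 suggest little dependence between
    consecutive measurements, values near +1 suggest strong dependence."""
    m = mean(series)
    numerator = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator

# Hypothetical hourly temperature readings that rise steadily (illustrative only).
hourly_temperature = [1, 2, 3, 4, 5, 6, 7, 8]
r1 = lag1_autocorrelation(hourly_temperature)
print(f"lag-1 autocorrelation = {r1:.3f}")
```

A clearly positive value, as for this steadily rising series, is a warning that standard errors and p-values from tests assuming independent observations may be unreliable.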
1. Based on the five questions
2. Based on the data
Source (https://statswithcats.wordpress.com/2010/08/27/the-right-tool-for-the-job/)
Source (http://www.efoza.com/postpic/2012/05/statistical-test-flow-chart_237102.jpg)
Guidance for the use of the
internal assessment criteria
The internally assessed component of the course is divided into five sections. The sections are differently
weighted to emphasize the relative contribution of each aspect to the overall quality of the investigation.
Pers. eng.   Exploration   Analysis   Evaluation   Communication   Total
2 (8%)       6 (25%)       6 (25%)    6 (25%)      4 (17%)         24 (100%)
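The percentages in the weighting table follow directly from the marks available for each section. A minimal check, using the marks listed above (the variable names are chosen for this example):

```python
# Marks available per criterion, as listed in the weighting table.
marks = {
    "Personal engagement": 2,
    "Exploration": 6,
    "Analysis": 6,
    "Evaluation": 6,
    "Communication": 4,
}

total = sum(marks.values())
# Each weight is the criterion's share of the total, rounded to a whole percent.
weights = {name: round(100 * m / total) for name, m in marks.items()}

for name, pct in weights.items():
    print(f"{name}: {marks[name]} marks ({pct}%)")
print(f"Total: {total} marks")
```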
Each section aims to assess a different aspect of the student’s research abilities. As the investigations, and therefore
the approaches to the investigation, will be specific to each student, the marking criteria are not designed to be a
tick-chart markscheme and each section is meant to be seen within the context of the whole. As such, a certain
degree of interpretation is inevitable. The following tips are designed to help focus on the intention of each section,
rather than be seen as a definitive approach.
Once you’ve completed your IA write-up, go through this handout and self-assess your work.
Suggestion: underline the descriptor components you think you have successfully reached/completed, and circle the
final mark with the best-fit approach.
Personal engagement
The emphasis within this section is on individuality and creativity within the investigation. The question to ask is, has
the chosen research question been devised as a result of the personal experience of the student? The question
could be a result of observations made in the student’s own environment or ideas that the student has had as the
result of learning, reading or experimenting in class. The investigation does not have to be ground-breaking
research, but there should be an indication that independent thought has been put into the choice of topic, the
method of inquiry and the presentation of the findings. The topic chosen should also be of suitable complexity. If
the research question is very basic or the answer self-evident then there is little opportunity to gain full marks for
exploration and analysis as the student will not have the opportunity to demonstrate his or her skills.
This criterion assesses the extent to which the student engages with the exploration and makes it their own. Personal
engagement may be recognized in different attributes and skills. These could include addressing personal interests or
showing evidence of independent thinking, creativity or initiative in the designing, implementation or presentation of
the investigation.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1
• The evidence of personal engagement with the exploration is limited with little independent thinking,
initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not
demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation
of the investigation.
2
• The evidence of personal engagement with the exploration is clear with significant independent
thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation
demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the
investigation.
Exploration
The issue here is the overall methodology. Students need to take their individual ideas and translate them into a
workable method. Students must also demonstrate the thinking behind their ideas using their subject knowledge.
The information given must be targeted at the problem rather than being a general account of the topic matter, in
order to demonstrate focus on the issues at hand.
What needs to be seen is a precise line of investigation that can be assessed using scientific protocols. It is then
expected that the student gives the necessary details of the method in terms of variables, controls and the nature
of the data that is to be generated. This data must be of sufficient quantity and treatable in an appropriate manner,
so that it can generate a conclusion, in order to fulfill the criteria of analysis and evaluation. If the method devised
does not lead to sufficient and appropriate data, this will lead to the student being penalized in subsequent sections
where this becomes the crux of the assessment.
Health and safety is a key consideration in experimental work and forms part of a good method. If the student is
working with animals or tissue, it is reasonable to expect there to be evidence that the guidelines for the use of
animals in IB World Schools have been read and adhered to. The use of human subjects in experiments is also
covered by this policy. If the student is working with chemicals, some explanation of safe handling and disposal
would be expected. Full awareness is when all potential hazards have been identified, with a brief outline given as to
how they will be addressed. It is only acceptable for there to be no evidence of a risk assessment if the investigation
is evidently risk-free—such as in investigations where a database or simulation has been used to generate the data.
This criterion assesses the extent to which the student establishes the scientific context for the work, states a clear and
focused research question and uses concepts and techniques appropriate to the DP level. Where appropriate, this
criterion also assesses awareness of safety, environmental, and ethical considerations.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The topic of the investigation is identified and a research question of some relevance is stated but it is not
focused.
• The background information provided for the investigation is superficial or of limited relevance and does
not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very
limited extent since it takes into consideration few of the significant factors that may influence the
relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues
that are relevant to the methodology of the investigation*.
3–4
• The topic of the investigation is identified and a relevant but not fully focused research question is
described.
• The background information provided for the investigation is mainly appropriate and relevant and aids
the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has
limitations since it takes into consideration only some of the significant factors that may influence the
relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues
that are relevant to the methodology of the investigation*.
5–6
• The topic of the investigation is identified and a relevant and fully focused research question is clearly
described.
• The background information provided for the investigation is entirely appropriate and relevant and
enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it
takes into consideration all, or nearly all, of the significant factors that may influence the relevance,
reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that
are relevant to the methodology of the investigation*.
Analysis
At the root of this section is the data generated and how it is processed. If there is insufficient data then any
treatment will be superficial. It is hoped that a student would recognize such a lack and revisit the method before
the analysis is arrived at. Alternatively, the use of databases or simulations to provide sufficient material for analysis
could help in such situations.
Any treatment of the data must be appropriate to the focus of the investigation in an attempt to answer the
research question. The conclusions drawn must be based on the evidence obtained from the data rather than on
assumptions. Given the scope of the internal assessment and the time allocated, it is more than likely that variability
in the data will lead to a tentative conclusion. This should be recognized and the extent of the variability considered.
The variability should be demonstrated and explained and its impact on the conclusion fully acknowledged. It is
important to note that, in this criterion, the word “conclusion” refers to a deduction based on direct interpretation
of the data, which is based on asking questions such as: What does the graph show? Does any statistical test used
support the conclusion?
This criterion assesses the extent to which the student’s report provides evidence that the student has selected,
recorded, processed and interpreted the data in ways that are relevant to the research question and can support a
conclusion.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid
conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the
analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very
incomplete.
3–4
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a
simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but
there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the
analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the
research question can be deduced.
5–6
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed
and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a
conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement
uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the
research question can be deduced.
Evaluation
Although it may appear that the student is asked to repeat the analysis of the data and the drawing of a conclusion
again in the evaluation, the focus is different. Once again the data and conclusion come under scrutiny but, in the
evaluation, the conclusion is placed into the context of the research question. So, in the analysis, it may be
concluded that there is a positive correlation between x and y; in the evaluation, the student is expected to put this
conclusion into the context of the original aim. In other words, does the conclusion support the student’s original
thinking in the topic? If not, a consideration of why it does not will lead into an evaluation of the limitations of the
method and suggestions as to how the method and approach could be adjusted to generate data that could help
draw a firmer conclusion. Variability of the data may well be mentioned again in the evaluation as this provides
evidence for the reliability of the conclusion. This will also lead into an assessment of the limitations of the method.
It is the focus on the limitations that is at issue in the evaluation, rather than a reiteration that there is variability.
This criterion assesses the extent to which the student’s report provides evidence of evaluation of the investigation and
the results with regard to the research question and the accepted scientific context.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• A conclusion is outlined which is not relevant to the research question or is not supported by the data
presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are
outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension
of the investigation.
3–4
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are
described and provide evidence of some awareness of the methodological issues* involved in
establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of
the investigation.
5–6
• A detailed conclusion is described and justified which is entirely relevant to the research question and
fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific
context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are
discussed and provide evidence of a clear understanding of the methodological issues* involved in
establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the
investigation.
Communication
The marking points for communication take the entire write-up into consideration. If a report is clearly written and
logically presented there should be no need for the teacher to re-read it. The information and explanations should
be targeted at the question in hand rather than being a general exposition of the subject area; in other words, the
report should be focused. The vocabulary should be subject-specific and of a quality appropriate to diploma level.
The subject-specific conventions that can be expected are the correct formats for graphs and tables and cell
headings, correct use of units and the recording of errors. This is not to say that the presentation needs to be
faultless to gain full marks. Minor errors are acceptable as long as they do not have a significant bearing on
understanding or the interpretation of the results.
This criterion assesses whether the investigation is presented and reported in a way that supports effective
communication of the focus, process and outcomes.
Mark Descriptor
0
• The student’s report does not reach a standard described by the descriptors below.
1–2
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and
outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and
outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence
of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
3–4
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus,
process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is
present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and
outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not
hamper understanding.
Investigation 1
Equipment
• 10 Petri dishes
• 100g of "Yates premium quality" potting mix
• 5.00g of hay
• 5.00g of Eucalyptus leaves
• 5.00g of grass
• Electronic weighing scale (±0.01g)
• 100 seeds of E. pilularis that are 2.00 mm in diameter (±0.5mm)
• 10.0cm ruler (±0.5mm)
• 100ml of de-ionized water to create the smoke water
• 100ml of de-ionized water to create the control
• Tea strainer
• 3 x 250ml graduated beaker (±0.4mL)
• Matches
• 2 Sand baths
• 2 thermometers (±0.05°C)
International Baccalaureate
A study on the effect of smoke water on the germination and growth of Eucalyptus pilularis
Background
Australia is a country where bushfires are commonplace during the summer season, and these fires affect
much of Australia's flora. As a by-product of this, numerous native Australian plants that inhabit fire-dependent ecosystems have evolved reproductive strategies to adapt to factors associated with fire. These
adaptations that affect their germination can be classified as either physical (derived from the immense
heat of the bushfire stimulating a seed to germinate) or chemical (derived from a combination of various
chemical elements produced by the smoke that stimulates germination).
Aim
The aim of this biology laboratory experiment is to explore the effects of smoke water, a mixture of water,
burnt plants and hay, and its effect on the germination and post germination growth of Eucalyptus pilularis
seeds also known as gumnut or blackbutt, an Australian native plant which predominates in forests that are
frequently burned.
To create the smoke water
1. Place 5g each of the hay, grass and Eucalyptus leaves into one of the 250ml beakers.
2. Ignite the organic matter with a match so that they catch on fire. Let them burn until they are all
charred.
3. Measure 100ml of de-ionized water with the second 250ml beaker. Pour this water into the first
beaker with the leaves, hay and twigs and leave to infuse for 5 hours.
4. Strain the smoke water mixture into the third measuring beaker using the tea strainer, ensuring
that you are only left with the liquid remnants.
SAFETY: Care should be taken when burning the organic matter. This should be carried out in a
ventilated area and the beakers should be made of heat-resistant glass.
Research question
Does smoke water stimulate germination and post germination growth of Eucalyptus pilularis seeds
compared to de-ionized water?
Prediction
Smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water, and thus, as a
result of this, the post germination growth of the Eucalyptus pilularis seeds by the smoke water will be
more effective. Effectiveness, for this experiment, is defined as the height of the seedling that emerges
from the germinated gumnut seed. If the various chemicals, such as phosphorous and nitrogenous
compounds found in the smoky remnants of organic matter function as chemical triggers, then Eucalyptus
pilularis will begin its germination out of its dormant state. These phosphorous and nitrogenous
compounds, such as NaNO₃, KNO₃, NH₄Cl and NH₄NO₃, that are naturally occurring in organic matter, are not
found in de-ionized water (Dixon et al. 1995), and hence, smoke water is predicted to germinate a larger
number of seeds and grow more after germination than de-ionized water.¹
Germination and growth
1. Set the sand baths to 30 degrees Celsius and place a thermometer in each one to verify the
temperature setting.
2. Place 5 Petri dishes into one sand bath and the remaining 5 Petri dishes into another. One will be
our control and one will be our test.
3. Measure out 10 x 10.0g of the potting mix using the electronic weighing scale and place 10.0g into
each one of 10 Petri dishes. 5 dishes for smoke water treatment and 5 dishes for de-ionised water
treatment.
4. Sow 10 gumnuts into each Petri dish and submerge them into the potting mix at a consistent depth
of 0.5cm. Place the seeds towards the edges of the Petri dish so they can be observed through the
glass without having to disturb the seeds to observe them.
5. Water the control sand bath at 8:15am with 10ml of de-ionized or smoke water each day for
fourteen days.
6. After 14 days, count the number of seeds germinated (distinguished by the emergence of the
seedling) and measure the height of the emergent seedling in the test and the control groups with
the 10.0cm ruler. The seedling height is measured from the soil surface to the highest part of the
stem.
7. Repeat the set up once to ensure sufficient data.
Method
Preliminary experiment
The gumnut seeds were obtained from trees growing in local forestry plantations.
It was felt necessary to find out if the gumnut seeds would germinate or not.
1. 50 seeds were planted in 5 Petri dishes of potting mixture (10 seeds per dish).
2. Each dish was watered with 10 ml of de-ionised water and left for two weeks at room temperature.
3. At the end of the two weeks the number of seeds germinating was counted.
Results
Number of seeds germinating = 22/50
Percentage germination = 44%
The supply of seeds was considered viable enough to proceed with the experiment.
Controlled Variables
• The same volume (10ml) of liquid is added to each dish at the same time (8:15am) each day
throughout the 14 days.
• All 100 E. pilularis seeds that were used in this experiment were kept within a size range of 2.00
mm in diameter
• The water used to create the smoke water was de-ionized water like the control, which allowed
consistency between the control and the test groups.
• The temperature of the seeds was kept constant at 30.0°C by the sand baths.
• The potting mix for the seeds was from the same brand, "Yates premium potting mix" and the mass
of potting mix used for the seeds was kept constant at 10.0g.
• Same amount of light was assumed to be received for each plant as the experiment was conducted
in the same location on the same days.
• The seeds were placed at a depth of 0.5cm into the soil in the Petri dish.
¹ http://anpsa.org.au/APOL2/jun96-6.html
Biology teacher support material
Number of seeds successfully germinated
In order to determine the number of seeds that were germinated successfully, the number of seeds that
showed distinct cracking of the seed coat and the emergence of the seedling for both the smoke water and
the de-ionized water test groups were counted and placed into the table below. The raw data is presented
in appendix A.
Water Type    Trial    Numbers germinated (/50)    Average    %
De-ionized    1        26                          25         49
              2        23
Smoked        1        43                          44         88
              2        45
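The averages and percentages in the table can be checked from the trial counts. The sketch below uses only the numbers reported in the table; the variable names are chosen for this example. Note that the de-ionized average works out to 24.5, so the table's 25 appears to be a rounded value, while the 49% matches the unrounded average.

```python
SEEDS_PER_TRIAL = 50

# Seeds germinated in each trial, as reported in the table.
germinated = {"De-ionized": [26, 23], "Smoked": [43, 45]}

results = {}
for water_type, counts in germinated.items():
    average = sum(counts) / len(counts)
    percent = 100 * average / SEEDS_PER_TRIAL
    results[water_type] = (average, percent)
    print(f"{water_type}: average = {average} of {SEEDS_PER_TRIAL}, {percent:.0f}% germinated")
```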
The experiment continued for fourteen days to allow for sufficient time to gauge the effect of the
different water types, the manipulated variable. Both sand baths set at the same temperature are placed
next to each other, as specified by the method, and they are assumed to be receiving equal amounts of
light. The potting mix was taken from the same batch, so all samples could be assumed to contain the same
ratio of ingredients. Furthermore, the E. pilularis was submerged into the potting mix at a consistent depth
of 0.5cm and towards the edges of the Petri dish to allow for observations to be made through the glass
without having to disrupt the seeds to observe them.
From the processed data that informs us about the number of seeds successfully germinated, we can
clearly see that smoke water germinates, on average.
Our method of data collection for this experiment is to count the seeds that successfully germinated from
the different Petri dishes in the control and test groups respectively, the measured variable. This is done by
observing through the side of the Petri dish whether the seed coat has broken and the seedling has
emerged. The other way to collect data in this experiment is to measure the height of the seedlings (from
the soil surface to the seedling tip) of the germinated seeds after the 14 days of the experiment. The
difference between smoke water and de-ionised water was determined using the χ² test for the
germination and the t-test for the growth of the seedlings.
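Assuming the test named for the germination counts is a chi-squared test of association, the calculation could look like the sketch below. It pools the two trials from the germination table (49 of 100 seeds for de-ionized water, 88 of 100 for smoke water); the pooling, the function name and the absence of a continuity correction are choices made for this illustration, not necessarily what the student did. The critical value is the standard table value for 1 degree of freedom at p = 0.05.

```python
# Observed counts pooled over both trials: (germinated, not germinated).
observed = {
    "De-ionized": (49, 51),
    "Smoked": (88, 12),
}

def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df

# H0: germination is independent of water type. H1: germination depends on water type.
chi2, df = chi_squared(observed)
CRITICAL_CHI2 = 3.841  # chi-squared table: df = 1, p = 0.05

print(f"chi-squared = {chi2:.3f}, df = {df}, critical value = {CRITICAL_CHI2}")
print("Reject H0" if chi2 > CRITICAL_CHI2 else "Fail to reject H0")
```

Counts, not percentages, go into the table: the chi-squared test is only valid on raw frequencies.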
[Pie chart: Graph of de-ionised water seed germination, showing the percentage of de-ionised water gumnut seeds germinated and the percentage NOT germinated]
Assumptions
• The light is of the same intensity because the seeds will be set up side by side.
• The de-ionized water contains the same impurities.
• The potting mix contains the same amount of its constituent components.
• The impurities and chemical elements in the air will be the same for both sets of seeds.
• The gumnut seeds are all composed of the same percentage of elements.
[Pie chart: Graph of smoke water seed germination]
Observations
• The E. pilularis seeds were no bigger than 2mm, and were brownish black in colour. There were no obvious signs of previous germination, or cracking of the outer seed coat.
• The smoke water was clearly distinctive from the de-ionized water. The de-ionized water was clear, as one would expect if it had been filtered. The smoke water, however, had a blackish, straw coloured hue, due to its absorption of the remnants of the burnt organic matter.
• Definite germination was seen on a lot more seeds with the smoke water than with the de-ionized water.
• The E. pilularis subjected to smoke water germinated earlier on average than the seeds subjected to de-ionized water. Seeds with smoke water started showing first signs of germination as early as 7 days, when their seed coats started to split to allow the seedlings to emerge. In comparison, the de-ionized watered seeds took up to 10 days to start showing germination.
• The E. pilularis that were germinated by the smoke water tended to have larger seedlings emerging from the split seed coat.
• The E. pilularis that were watered with the smoke water had significantly larger cracking of the seed coat, allowing for more space for the seedlings to grow and extend outwards from the shell.
• The colour of the seedlings in both experiments was a distinct dark purple colour, and leaves appeared only on the smoke water experiment, with a maximum of 2 small, juvenile leaves found, measuring no more than approximately 50.0mm.
[Pie chart legend: percentage of smoke water gumnut seeds germinated; percentage of smoke water gumnut seeds NOT germinated]
χ² test
In order to see if there is a significant difference between the germination of the seeds treated with smoke water and de-ionised water, a χ² test was carried out.
Null Hypothesis: Smoke water does not affect germination of gumnut seeds
Alternative Hypothesis: Smoke water affects germination of gumnut seeds
                  Smoke water   De-ionised water   Row total
Germinated        88            49                 137
Not germinated    12            51                 63
Column total      100           100                200
Proportion of seeds germinating = 137/200 = 68.5%
Proportion of seeds not germinating = 100 – 68.5 = 31.5%
Expected number of smoke water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of de-ionised water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of smoke water treated seeds not to germinate = 31.5% of 100 = 31.5
Expected number of de-ionised water treated seeds not to germinate = 31.5% of 100 = 31.5
Treatment and outcome               Observed frequency O   Expected frequency E   Difference O–E   Positive difference |O–E|   (|O–E|)²/E
Smoke water, germinated             88                     68.5                   19.5             19.5                        5.55
De-ionised water, germinated        49                     68.5                   –19.5            19.5                        5.55
Smoke water, not germinated         12                     31.5                   –19.5            19.5                        12.07
De-ionised water, not germinated    51                     31.5                   19.5             19.5                        12.07
χ²calc                                                                                                                         35.25
Number of degrees of freedom = (rows – 1) x (columns – 1) = (2-1) x (2-1) = 1
χ²crit = 3.84 for p=0.05
Since the test value χ²calc = 35.25 is a lot greater than the critical value χ²crit = 3.84, we must reject the Null Hypothesis and accept the Alternative Hypothesis. The test value is significant for p < 0.001.
The effect of smoke water and de-ionized water on post germination growth
This section of the experiment is designed to test the effectiveness of gumnut seed germination, depending on the type of water it received, either de-ionized or smoke water. Effectiveness was determined by the height of the seedling that emerged from the seed coat of the germinated gumnut seeds. The higher the seedling the more effective the water is on germination. The raw data is presented in appendix A.
Height of seedlings for germinated seeds
Water Type   Trial   Trial average of seedling height /mm ±0.5mm   Standard Deviation   Overall average height /mm ±0.5mm   Overall standard deviation
De-ionized   1       13.0                                          13.4                 12.4                                13.6
             2       11.8                                          13.9
Smoked       1       57.8                                          24.5                 59.5                                23.4
             2       61.1                                          22.3
On first observation of the processed data, it can be seen that smoked water clearly has a higher average seedling height than the de-ionized water (59.5 mm against 12.4 mm), and its standard deviation is smaller relative to that average (23.4 mm against 13.6 mm). This indicates that the seedlings of the smoked water seeds grew higher than those of the de-ionized water seeds. The error bars in the graph below suggest that there may be a significant difference between the effects of the treatments on seedling growth. However, the range of variation in the results as given by the standard deviations is large, especially for the de-ionised water treatment trials. To verify this, a t-test was carried out on the data.
[Bar chart: The effect of smoke water on the growth of gumnut (Eucalyptus pilularis) seedlings. y-axis: Average seedling length / mm (0 to 90); x-axis: Treatment (Smoke water, De-ionised water). Error bars = ±1 standard deviation.]
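Before moving on to the t-test, the χ² result for the germination counts can be reproduced in a few lines. This is a minimal sketch rather than the student's own working: the function name `chi_squared` and the dictionary layout are hypothetical, the observed counts are those of the contingency table (88 and 12 for smoke water, 49 and 51 for de-ionised water), and the critical value 3.84 is simply the one quoted in the report.

```python
# Observed germination counts from the contingency table (100 seeds per treatment).
observed = {
    ("smoke", "germinated"): 88,
    ("smoke", "not germinated"): 12,
    ("de-ionised", "germinated"): 49,
    ("de-ionised", "not germinated"): 51,
}


def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a two-way table."""
    treatments = sorted({t for t, _ in table})
    outcomes = sorted({o for _, o in table})
    grand_total = sum(table.values())
    col_totals = {t: sum(table[(t, o)] for o in outcomes) for t in treatments}
    row_totals = {o: sum(table[(t, o)] for t in treatments) for o in outcomes}
    statistic = 0.0
    for t in treatments:
        for o in outcomes:
            # Expected count = row total x column total / grand total.
            expected = row_totals[o] * col_totals[t] / grand_total
            statistic += (table[(t, o)] - expected) ** 2 / expected
    dof = (len(outcomes) - 1) * (len(treatments) - 1)
    return statistic, dof


CHI2_CRIT = 3.84  # critical value for 1 degree of freedom at p = 0.05, as quoted in the report
chi2_calc, dof = chi_squared(observed)
print(f"chi-squared = {chi2_calc:.2f}, degrees of freedom = {dof}")
print("reject null hypothesis" if chi2_calc > CHI2_CRIT else "retain null hypothesis")
```

Deriving the expected counts from the row and column totals gives the same 68.5 and 31.5 as the percentage method used in the report, because both treatments have 100 seeds.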
t-test
In order to test statistically whether the shoots of gumnut seedlings germinated with smoke water grew more than those germinated with de-ionized water, a two-tailed t-test for independent samples was carried out to investigate whether there is a significant difference between the growth of the seedlings.
• Null Hypothesis - the smoke water has no effect on post germination growth of the gumnut seedlings.
• Alternative Hypothesis - the smoke water does have an effect on post germination growth of the gumnut seedlings.
t-test formula: t = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2)
degrees of freedom = n1 + n2 – 2 = 198
tcalc = 17.4
tcrit (p=0.05) = 1.97
Because our test value tcalc = 17.4 is greater than the critical value tcrit = 1.97 at p = 0.05, we can accept the alternative hypothesis, that the smoke water significantly stimulates the growth of the gumnut seedlings germinated. The test value is significant for p < 0.001.
Conclusion
In conclusion, the experiment supported my hypothesis that smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water. Furthermore, the subsequent growth of the Eucalyptus pilularis seeds treated with the smoke water was found to be more effective than with the de-ionized water, as shown by the significantly taller seedlings of the Eucalyptus pilularis that were exposed to the smoke water. This could be because the various chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of the burnt organic matter (in my case, the burnt leaves, hay and twigs), acted as chemical triggers for the E. pilularis to begin its germination out of its dormant state and stimulated its subsequent growth. While not all of the active compounds in smoke have yet been identified, a large majority of the compounds present in the smoke water mixture (NaNO3, KNO3, NH4Cl and NH4NO3) are water soluble, thus they are easily able to be taken in by the gumnut seed and, once inside the seed, they are used as these so-called "chemical triggers" to start germination. These chemical triggers work by altering the levels of chemicals that the seed maintains in homeostasis; once the seed has registered these differing levels of phosphorous and nitrogenous compounds, germination of the seed is stimulated. There are, however, compounds called butenolides that have confirmed germination-promoting action. These butenolides are produced by some plants on exposure to high temperatures and smoke caused by bush fires. In particular, botanists Flematti, Ghisalberti, Dixon and Trengove isolated a particular butenolide called 3-methyl-2H-furo[2,3-c]pyran-2-one, which was found to trigger seed germination in plants whose reproduction is fire-dependent, such as the E. pilularis used in my experiment 3. One theory about how this butenolide, 3-methyl-2H-furo[2,3-c]pyran-2-one, is formed by the plant is given to us by Light, Burger and van Staden, who hypothesized that this particular butenolide was created from cellulose within the plant, and that this substance stimulated the seed's reproductive cycle and, hence, germination 4. The two pie graphs that show the percentage of seeds germinated for the smoke water experiment and de-ionized water experiment respectively further indicate that my hypothesis was correct, with 88% of the smoke watered seeds successfully germinating compared to only 49% of the de-ionized water seeds germinating. This was backed up with my χ²-test, which concluded that we could reject the null hypothesis, with a 95% degree of confidence, and that the smoke water successfully germinated more seeds than the de-ionized water. The t-test on the seedling growth shows that the smoke water has a significant positive effect on the gumnut seedlings.
Evaluation of Weaknesses with suggested improvements
The potting mixture used was obtained from the local garden shop, and whilst the same brand and the same amount of the potting mixture was used for both sets of seeds in the experiment, the potting mixture may have contained impurities which could potentially have enhanced or reduced the ability of the seeds to germinate, especially because the Yates brand "Contains trace elements to add extra vital nutrients" 2. Some of the chemicals from the smoke water also could have potentially reacted with some of the ingredients of the potting mix and rendered them useless, whereas the seeds watered with de-ionized water may not have had this potential problem. To improve this, I could have used a different support for the seeds such as cotton wool or filter paper.
Using different types of leaves, twigs and hay to create the smoke water would give different chemicals, as each has a differing composition of chemicals, some of which may be beneficial for germination, and some of which would not. For this experiment, I could have used only one type of organic matter, like hay, instead of twigs and leaves as well. This would narrow the scope of my results and I would potentially be able to pinpoint the specific chemical, or source of the chemical, that allows gumnuts to germinate successfully. It may be found that twigs, for example, don't enhance seed germination but leaves do. By singling out the element that best enhances seed germination, further experiments could be carried out, and the exact chemical that best enhances the seeds' germination could be identified.
Combined with this, I could have used gumnut seeds that were all the same weight rather than the same size in diameter. I tried to use gumnut seeds that were only 2.00mm in diameter; however, it would have been better to use seeds that all had a constant weight of 0.2g, for example, as then I could have assumed that each seed contained the same amounts and composition of nutrients, enzymes and other chemicals inside it.
To further narrow the scope of the experiment, I could have tested the effects of different concentrations of the smoke water as well. Instead of only using a 1:10 ratio of 1 part twigs, hay and leaves to 10 parts de-ionized water, I could have tested a ratio of 1:5 with 1 part twigs, hay and leaves and 5 parts de-ionized water. Working out the optimum concentration of smoke water would help this experiment as better and clearer results could be obtained.
Bibliography
2 Yates Gardening Ltd, Sydney, Australia. http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/ Last visited July 10 2011.
3 Gavin R. Flematti, Emilio L. Ghisalberti, Kingsley W. Dixon and Robert D. Trengove. A Compound from Smoke That Promotes Seed Germination. Science, 13 August 2004: Vol. 305 no. 5686 p. 977. Published Online July 8 2004. http://www.sciencemag.org/content/305/5686/977
4 Marnie E. Light, Barend V. Burger and Johannes van Staden. Formation of a Seed Germination Promoter from Carbohydrates and Amino Acids. J. Agric. Food Chem., 2005, 53 (15), pp 5936–5942. Publication Date (Web): July 1, 2005. http://pubs.acs.org/doi/abs/10.1021/jf050710u
Appendix A - raw data tables
Seeds watered with De-ionized water (Trial 1)
Seed Number   Did the seed Germinate   Height of seedling / mm ±0.5mm
1             Yes                      18.0
2             Yes                      27.0
3             Yes                      19.0
4             No                       0
5             No                       0
6             No                       0
7             Yes                      24.0
8             No                       0
9             Yes                      25.0
10            No                       0
11            Yes                      28.0
12            No                       0
13            No                       0
14            Yes                      17.0
15            Yes                      23.0
16            No                       0
17            Yes                      16.0
18            No                       0
19            Yes                      26.0
20            Yes                      27.0
21            Yes                      15.0
22            No                       0
23            No                       0
24            Yes                      27.0
25            No                       0
26            Yes                      21.0
27            Yes                      22.0
28            No                       0
29            Yes                      27.0
30            Yes                      37.0
31            No                       0
32            No                       0
33            Yes                      26.0
34            Yes                      31.0
35            No                       0
36            No                       0
37            Yes                      27.0
38            Yes                      41.0
39            No                       0
40            No                       0
41            No                       0
42            Yes                      25.0
43            No                       0
44            Yes                      19.0
45            No                       0
46            No                       0
47            Yes                      37.0
48            Yes                      22.0
49            No                       0
50            Yes                      25.0
Seeds watered with Smoke Water (Trial 1)
Seed Number   Did the seed Germinate   Height of seedling / mm ±0.5mm
1             Yes                      56.0
2             Yes                      71.0
3             Yes                      73.0
4             Yes                      67.0
5             Yes                      54.0
6             No                       0
7             Yes                      58.0
8             Yes                      70.0
9             Yes                      66.0
10            Yes                      61.0
11            Yes                      64.0
12            Yes                      71.0
13            No                       0
14            No                       0
15            Yes                      59.0
16            Yes                      67.0
17            Yes                      58.0
18            Yes                      63.0
19            Yes                      62.0
20            Yes                      64.0
21            Yes                      72.0
22            Yes                      75.0
23            No                       0
24            Yes                      68.0
25            Yes                      64.0
26            Yes                      69.0
27            Yes                      70.0
28            No                       0
29            Yes                      52.0
30            No                       0
31            Yes                      79.0
32            Yes                      81.0
33            Yes                      83.0
34            Yes                      74.0
35            Yes                      74.0
36            Yes                      78.0
37            Yes                      63.0
38            Yes                      69.0
39            Yes                      58.0
40            Yes                      70.0
41            Yes                      68.0
42            Yes                      62.0
43            Yes                      63.0
44            Yes                      68.0
45            Yes                      58.0
46            Yes                      81.0
47            Yes                      68.0
48            Yes                      73.0
49            Yes                      67.0
50            No                       0
Seeds watered with De-Ionized water (Trial 2)
Seed Number   Did the seed Germinate   Height of seedling / mm ±0.5mm
1             No                       0
2             Yes                      26.0
3             Yes                      21.0
4             Yes                      23.0
5             No                       0
6             No                       0
7             No                       0
8             No                       0
9             Yes                      31.0
10            No                       0
11            Yes                      14.0
12            No                       0
13            No                       0
14            Yes                      16.0
15            Yes                      18.0
16            No                       0
17            No                       0
18            No                       0
19            Yes                      26.0
20            Yes                      31.0
21            Yes                      25.0
22            No                       0
23            No                       0
24            Yes                      21.0
25            No                       0
26            Yes                      31.0
27            Yes                      26.0
28            No                       0
29            Yes                      23.0
30            Yes                      36.0
31            No                       0
32            No                       0
33            Yes                      14.0
34            Yes                      23.0
35            No                       0
36            No                       0
37            Yes                      23.0
38            Yes                      27.0
39            No                       0
40            No                       0
41            No                       0
42            Yes                      24.0
43            No                       0
44            Yes                      45.0
45            No                       0
46            No                       0
47            Yes                      42.0
48            Yes                      23.0
49            No                       0
50            No                       0
Seeds watered with Smoke water (Trial 2)
Seed Number   Did the seed Germinate   Height of seedling / mm ±0.5mm
1             Yes                      72.0
2             Yes                      73.0
3             No                       0
4             Yes                      72.0
5             Yes                      57.0
6             Yes                      74.0
7             Yes                      79.0
8             Yes                      62.0
9             Yes                      78.0
10            Yes                      64.0
11            Yes                      72.0
12            Yes                      79.0
13            Yes                      72.0
14            Yes                      57.0
15            Yes                      56.0
16            Yes                      83.0
17            Yes                      63.0
18            No                       0
19            Yes                      72.0
20            Yes                      63.0
21            No                       0
22            Yes                      58.0
23            Yes                      81.0
24            Yes                      57.0
25            Yes                      62.0
26            No                       0
27            Yes                      74.0
28            Yes                      73.0
29            Yes                      83.0
30            Yes                      58.0
31            Yes                      74.0
32            Yes                      57.0
33            Yes                      63.0
34            Yes                      79.0
35            Yes                      60.0
36            Yes                      74.0
37            Yes                      79.0
38            Yes                      57.0
39            Yes                      86.0
40            Yes                      53.0
41            Yes                      56.0
42            Yes                      67.0
43            Yes                      63.0
44            Yes                      68.0
45            Yes                      54.0
46            Yes                      68.0
47            Yes                      68.0
48            No                       0
49            Yes                      62.0
50            Yes                      72.0
Investigation 1 (annotated)
A study on the effect of smoke water on the germination and growth
of Eucalyptus pilularis
Background
Australia is a country where bushfires are commonplace during the summer season, and these fires affect
much of Australia's flora. As a by-product of this, numerous native Australian plants that inhabit fire-dependent ecosystems have evolved reproductive strategies to adapt to factors associated with fire. These
adaptations that affect their germination can be classified as either physical (derived from the immense
heat of the bushfire stimulating a seed to germinate) or chemical (derived from a combination of various
chemical elements produced by the smoke that stimulates germination).
Aim
The aim of this biology laboratory experiment is to explore the effect of smoke water, a mixture of water,
burnt plants and hay, on the germination and post germination growth of Eucalyptus pilularis
seeds, also known as gumnut or blackbutt, an Australian native plant which predominates in forests that are
frequently burned.
Comm: Overall the report is clear, concise and logically structured.
Comm: Subject specific terminology and notation are used throughout.
PE: Student shows a high degree of engagement with the investigation.
EX: Investigation set in context and justified.
EX: Smoke water defined
EX: Research question focussed
EX: Methodology appropriate
Prediction
Smoke water will successfully germinate more Eucalyptus pilularis than de-ionized water, and thus, as a
result of this, the post germination growth of the Eucalyptus pilularis seeds by the smoke water will be
more effective. Effectiveness, for this experiment, is defined as the height of the seedling that emerges
from the germinated gumnut seed. If the various chemicals, such as phosphorous and nitrogenous
compounds found in the smoky remnants of organic matter function as chemical triggers, then Eucalyptus
pilularis will begin its germination out of its dormant state. These phosphorous and nitrogenous
compounds, such as NaNO3, KNO3, NH4Cl and NH4NO3, that are naturally occurring in organic matter, are not
found in de-ionized water (Dixon et al. 1995), and hence, smoke water is predicted to germinate a larger
number of seeds and grow more after germination than de-ionized water 1.
Ex: Defines method to collect relevant
data
Preliminary experiment
The gumnut seeds were obtained from trees growing in local forestry plantations.
It was felt necessary to find out if the gumnut seeds would germinate or not.
1. 50 seeds were planted in 5 Petri dishes of potting mixture (10 seeds per dish).
2. Each dish was watered with 10 ml of de-ionised water and left for two weeks at room temperature.
3. At the end of the two weeks the number of seeds germinating was counted.
Results
Number of seeds germinating = 22/50
Percentage germination = 44%
The supply of seeds was considered viable enough to proceed with the experiment.
Ex: Method can be easily followed and repeated by others.
Ex: Anticipates that method may need modifying. Sufficient data is planned for
Ex: Suitable control
An: Data displayed from trial run
An: Appropriate processing
Ev: Conclusion made from trial run.
Equipment
• 10 Petri dishes
• 100g of "Yates premium quality" potting mix
• 5.00g of hay
• 5.00g of Eucalyptus leaves
• 5.00g of grass
• Electronic weighing scale (±0.01g)
• 100 seeds of E. pilularis that are 2.00 mm in diameter (±0.5mm)
• 10.0cm ruler (±0.5mm)
• 100ml of de-ionized water to create the smoke water
• 100ml of de-ionized water to create the control
• Tea strainer
• 3 x 250ml graduated beakers (±0.4mL)
• Matches
• 2 Sand baths
• 2 thermometers (±0.05°C)
To create the smoke water
1. Place 5g each of the hay, grass and Eucalyptus leaves into one of the 250ml beakers.
2. Ignite the organic matter with a match so that it catches fire. Let it burn until it is all
charred.
3. Measure 100ml of de-ionized water with the second 250ml beaker. Pour this water into the first
beaker with the leaves, hay and twigs and leave to infuse for 5 hours.
4. Strain the smoke water mixture into the third measuring beaker using the tea strainer, ensuring
that you are only left with the liquid remnants.
SAFETY Care should be taken when burning the organic matter; this should be carried out in a
ventilated area and the beakers should be made of heat-resistant glass.
Research question
Does smoke water stimulate germination and post germination growth of Eucalyptus pilularis seeds
compared to de-ionized water?
Method
(Footnote 1: http://anpsa.org.au/APOL2/jun96-6.html)
Germination and growth
1. Set the sand baths to 30 degrees Celsius and place a thermometer in each one to verify the
temperature setting.
2. Place 5 Petri dishes into one sand bath and the remaining 5 Petri dishes into another. One will be
our control and one will be our test.
3. Measure out 10 x 10.0g of the potting mix using the electronic weighing scale and place 10.0g into
each one of 10 Petri dishes. 5 dishes for smoke water treatment and 5 dishes for de-ionised water
treatment.
4. Sow 10 gumnuts into each Petri dish and submerge them into the potting mix at a consistent depth
of 0.5cm. Place the seeds towards the edges of the Petri dish so they can be observed through the
glass without having to disturb the seeds to observe them.
5. Water each Petri dish at 8:15am with 10ml of de-ionized water (control sand bath) or smoke water
(test sand bath) each day for fourteen days.
6. After 14 days, count the number of seeds germinated (distinguished by the emergence of the
seedling) and measure the height of the emergent seedling in the test and the control groups with
the 10.0cm ruler. The seedling height is measured from the soil surface to the highest part of the
stem.
7. Repeat the set up once to ensure sufficient data.
Controlled Variables
• The same volume (10ml) of liquid is added to each dish at the same time (8:15am) each day throughout the 14 days.
• All 100 E. pilularis seeds that were used in this experiment were kept within a size range of 2.00 mm in diameter.
• The water used to create the smoke water was de-ionized water like the control, which allowed consistency between the control and the test groups.
Ex: Safety risks assessed
Ex: Plans for sufficient data
Ex: Plans for sufficient data
Comm: Correct definition of germination
Ex: Plans for sufficient data
Ex: Thorough consideration of the other factors that may influence the investigation
• The temperature of the seeds was kept constant at 30.0°C by the sand baths.
• The potting mix for the seeds was from the same brand, "Yates premium potting mix" and the mass of potting mix used for the seeds was kept constant at 10.0g.
• Same amount of light was assumed to be received for each plant as the experiment was conducted in the same location on the same days.
• The seeds were placed at a depth of 0.5cm into the soil in the Petri dish.
Number of seeds successfully germinated
In order to determine the number of seeds that were germinated successfully, the number of seeds that
showed distinct cracking of the seed coat and the emergence of the seedling for both the smoke water and
the de-ionized water test groups were counted and placed into the table below. The raw data is presented
in appendix A.
Water Type    Trial   Numbers germinated (/50)   Average   %
De-ionized    1       26                         25        49
              2       23
Smoked        1       43                         44        88
              2       45
The experiment continued for fourteen days to allow sufficient time to gauge the effect of the
different water types, the manipulated variable. Both sand baths set at the same temperature are placed
next to each other, as specified by the method, and they are assumed to be receiving equal amounts of
light. The potting mix was taken from the same batch, so all samples could be assumed to contain the same
ratio of ingredients. Furthermore, the E. pilularis was submerged into the potting mix at a consistent depth
of 0.5cm and towards the edges of the Petri dish to allow for observations to be made through the glass
without having to disrupt the seeds to observe them.
Our method of data collection for this experiment is to count the seeds that successfully germinated from
the different Petri dishes in the control and test groups respectively, the measured variable. This is done by
observing through the side of the Petri dish whether the seed coat has broken and the seedling has
emerged. The other way to collect data in this experiment is to measure the height of the seedlings (from
the soil surface to the seedling tip) of the germinated seeds after the 14 days of the experiment. The
difference between smoke water and de-ionised water was determined using the χ² test for the
germination and the t-test for the growth of the seedlings.
From the processed data that informs us about the number of seeds successfully germinated, we can
clearly see that smoke water germinates, on average, more seeds than de-ionized water (44 against 25 out of 50).
[Pie chart: Graph of de-ionised water seed germination. Segments: percentage of de-ionised water gumnut seeds germinated; percentage of de-ionised water gumnut seeds NOT germinated.]
Comm: Data analysis can be followed (no need for a worked example here)
An: Uncertainties missing but not considered relevant here for a count. However uncertainties ±2% could have featured for the percentage germination data.
An: Appropriate graphical presentation of processed data
An: Appropriate method of analysis chosen
Assumptions
• The light is of the same intensity because the seeds will be set up side by side.
• The de-ionized water contains the same impurities.
• The potting mix contains the same amount of its constituent components.
• The impurities and chemical elements in the air will be the same for both sets of seeds.
• The gumnut seeds are all composed of the same percentage of elements.
Comm: Clear presentation of graph
Observations
• The E. pilularis seeds were no bigger than 2mm, and were brownish black in colour. There were no obvious signs of previous germination, or cracking of the outer seed coat.
• The smoke water was clearly distinctive from the de-ionized water. The de-ionized water was clear, as one would expect if it had been filtered. The smoke water, however, had a blackish, straw coloured hue, due to its absorption of the remnants of the burnt organic matter.
• Definite germination was seen on a lot more seeds with the smoke water than with the de-ionized water.
• The E. pilularis subjected to smoke water germinated earlier on average than the seeds subjected to de-ionized water. Seeds with smoke water started showing first signs of germination as early as 7 days, when their seed coats started to split to allow the seedlings to emerge. In comparison, the de-ionized watered seeds took up to 10 days to start showing germination.
• The E. pilularis that were germinated by the smoke water tended to have larger seedlings emerging from the split seed coat.
• The E. pilularis that were watered with the smoke water had significantly larger cracking of the seed coat, allowing for more space for the seedlings to grow and extend outwards from the shell.
• The colour of the seedlings in both experiments was a distinct dark purple colour, and leaves appeared only on the smoke water experiment, with a maximum of 2 small, juvenile leaves found, measuring no more than approximately 50.0mm.
An: Adequate qualitative observations made
Comm: Data table set in context. Clear, unambiguous presentation
χ² test
In order to see if there is a significant difference between the germination of the seeds treated with smoke water and de-ionised water, a χ² test was carried out.
Null Hypothesis: Smoke water does not affect germination of gumnut seeds
Alternative Hypothesis: Smoke water affects germination of gumnut seeds
                  Smoke water   De-ionised water   Row total
Germinated        88            49                 137
Not germinated    12            51                 63
Column total      100           100                200
Proportion of seeds germinating = 137/200 = 68.5%
Proportion of seeds not germinating = 100 – 68.5 = 31.5%
Expected number of smoke water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of de-ionised water treated seeds to germinate = 68.5% of 100 = 68.5
Expected number of smoke water treated seeds not to germinate = 31.5% of 100 = 31.5
Expected number of de-ionised water treated seeds not to germinate = 31.5% of 100 = 31.5
Treatment and outcome               Observed frequency O   Expected frequency E   Difference O–E   Positive difference |O–E|   (|O–E|)²/E
Smoke water, germinated             88                     68.5                   19.5             19.5                        5.55
De-ionised water, germinated        49                     68.5                   –19.5            19.5                        5.55
Smoke water, not germinated         12                     31.5                   –19.5            19.5                        12.07
De-ionised water, not germinated    51                     31.5                   19.5             19.5                        12.07
χ²calc                                                                                                                         35.25
Comm: Data processing can be followed.
Number of degrees of freedom = (rows – 1) x (columns – 1) = (2-1) x (2-1) = 1
χ²crit = 3.84 for p=0.05
Since the test value χ²calc = 35.25 is a lot greater than the critical value χ²crit = 3.84, we must reject the Null Hypothesis and accept the Alternative Hypothesis. The test value is significant for p < 0.001.
An: Processed data correctly interpreted
An: Successful data analysis completed. Conclusion can be deduced.
The effect of smoke water and de-ionized water on post germination growth
This section of the experiment is designed to test the effectiveness of gumnut seed germination, depending on the type of water it received, either de-ionized or smoke water. Effectiveness was determined by the height of the seedling that emerged from the seed coat of the germinated gumnut seeds. The higher the seedling the more effective the water is on germination. The raw data is presented in appendix A.
Comm: Terminology is imprecise here. Strictly speaking this is post germination growth
Height of seedlings for germinated seeds
Water Type   Trial   Trial average of seedling height /mm ±0.5mm   Standard Deviation   Overall average height /mm ±0.5mm   Overall standard deviation
De-ionized   1       13.0                                          13.4                 12.4                                13.6
             2       11.8                                          13.9
Smoked       1       57.8                                          24.5                 59.5                                23.4
             2       61.1                                          22.3
Comm: Data table set in context. Clear, unambiguous presentation. Processing can be followed, a worked example is not expected here. Processing can be followed. Correct conventions for uncertainties
On first observation of the processed data, it can be seen that smoked water clearly has a higher average seedling height than the de-ionized water (59.5 mm against 12.4 mm), and its standard deviation is smaller relative to that average (23.4 mm against 13.6 mm). This indicates that the seedlings of the smoked water seeds grew higher than those of the de-ionized water seeds. The error bars in the graph below suggest that there may be a significant difference between the effects of the treatments on seedling growth. However, the range of variation in the results as given by the standard deviations is large, especially for the de-ionised water treatment trials. To verify this, a t-test was carried out on the data.
[Bar chart: The effect of smoke water on the growth of gumnut (Eucalyptus pilularis) seedlings. y-axis: Average seedling length / mm (0.0 to 90.0); x-axis: Treatment (Smoke water, De-ionised water). Error bars = ±1 standard deviation.]
An: The candidate considers the reliability of the data though it could be argued that ungerminated seeds should not be included here. These results (0cm growth) skew the distribution so that it is not normally distributed.
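The point made in the annotation about ungerminated seeds can be illustrated with one of the raw data sets. The sketch below is an illustration only, not part of the assessed report: it uses the Appendix A heights for the de-ionized water seeds of Trial 1, and the variable names are hypothetical. It compares the mean and sample standard deviation with the zeros included (as the student did) and with germinated seeds only.

```python
from statistics import mean, stdev

# Seedling heights / mm for de-ionized water, Trial 1 (Appendix A).
# A height of 0 means the seed did not germinate.
heights = [
    18, 27, 19, 0, 0, 0, 24, 0, 25, 0,
    28, 0, 0, 17, 23, 0, 16, 0, 26, 27,
    15, 0, 0, 27, 0, 21, 22, 0, 27, 37,
    0, 0, 26, 31, 0, 0, 27, 41, 0, 0,
    0, 25, 0, 19, 0, 0, 37, 22, 0, 25,
]

# Keep only the seeds that actually produced a seedling.
germinated = [h for h in heights if h > 0]

mean_all, sd_all = mean(heights), stdev(heights)
mean_germ, sd_germ = mean(germinated), stdev(germinated)

print(f"all 50 seeds:       mean {mean_all:.1f} mm, SD {sd_all:.1f} mm")
print(f"{len(germinated)} germinated seeds: mean {mean_germ:.1f} mm, SD {sd_germ:.1f} mm")
```

With the zeros included the standard deviation is about as large as the mean, which reflects the two separate groups (germinated and ungerminated) rather than variation in seedling growth.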
t-test
In order to test statistically whether the shoots of gumnut seedlings germinated with smoke water grew more than those germinated with de-ionized water, a two-tailed t-test for independent samples was carried out to investigate whether there is a significant difference between the growth of the seedlings.
• Null Hypothesis - the smoke water has no effect on post germination growth of the gumnut seedlings.
• Alternative Hypothesis - the smoke water does have an effect on post germination growth of the gumnut seedlings.
An: Appropriate method of analysis chosen.
t-test formula: t = (x̄1 – x̄2) / √(s1²/n1 + s2²/n2)
degrees of freedom = n1 + n2 – 2 = 198
tcalc = 17.4
tcrit (p=0.05) = 1.97
Because our test value tcalc = 17.4 is greater than the critical value tcrit = 1.97 at p = 0.05, we can accept the
alternative hypothesis, that the smoke water significantly stimulates the growth of the gumnut seedlings
germinated. The test value is significant for p < 0.001.
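The reported t value can be checked from summary statistics alone. This is a hedged sketch, not the student's own calculation: the function `t_from_summary` is hypothetical, and it assumes overall means of 59.5 mm (smoke water) and 12.4 mm (de-ionised water) with standard deviations of 23.4 mm and 13.6 mm, the values consistent with the raw data in Appendix A, and 100 seeds per treatment. The critical value 1.97 is taken from the report.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and degrees of freedom from summary statistics."""
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_value = (mean1 - mean2) / standard_error
    dof = n1 + n2 - 2
    return t_value, dof


T_CRIT = 1.97  # two-tailed critical value at p = 0.05, as quoted in the report

# Assumed summary values: smoke water first, de-ionised water second.
t_calc, dof = t_from_summary(59.5, 23.4, 100, 12.4, 13.6, 100)
print(f"t = {t_calc:.1f}, degrees of freedom = {dof}")
print("significant at p = 0.05" if abs(t_calc) > T_CRIT else "not significant at p = 0.05")
```

Because both samples contain 100 seeds, the pooled-variance and separate-variance forms of the standard error give the same value here, so the choice between them does not affect the result.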
Evaluation of Weaknesses with suggested improvements
The potting mixture used was obtained from the local garden shop, and whilst the same brand and the same
amount of the potting mixture was used for both seeds in the experiment, the potting mixture may have
contained impurities which could potentially have enhanced or reduced the ability of the seeds to germinate,
especially because the Yates brand "Contains trace elements to add extra vital nutrients" 2. Some of the chemicals
from the smoke water also could have potentially reacted with some of the ingredients of the potting mix and
rendered them useless; however, the seeds watered with de-ionized water may not have had this potential
problem. To improve this, I could have used a different support for the seeds such as cotton wool or filter paper.
Comm: Processing can be followed.
An: Successful data analysis and
interpretation completed
Ev: Student considers the reliability of
the data and considers the impact of
experimental uncertainty
Ev: Sensible suggested improvement.
Using different types of leaves, twigs and hay to create the smoke water would give you different chemicals, as
each has a differing composition of chemicals, some of which may be beneficial for germination, and some of
which wouldn't. For this experiment, I could have used only one variable like hay, instead of twigs and leaves as
well. This would narrow my scope of results down as well and I would potentially be able to pinpoint the specific
chemical, or source of the chemical, that allows gumnuts to germinate successfully. It may be found that twigs, for
example, don't enhance seed germination but leaves do. By singling out the element that best enhances seed
germination, further experiments could be carried out, and the exact chemical could be identified, that best
enhances the seeds germination.
Ev: Feasible extension proposed.
Combined with this, I could have used gumnut seeds that were all the same weight rather than the same size in
diameter. I tried to use gumnut seeds that were only 2.00 mm in diameter; however, it would have been better
to use seeds that all had a constant weight of, for example, 0.2 g, as then I could have assumed that each
seed contained the same amounts and composition of nutrients, enzymes and other chemicals inside it.
Ev: Suggested improvement
impractical
To further narrow my scope of the experiment, I could have tested the effects of different concentrations of the
smoke water as well. Instead of only using a 1:10 ratio of 1 part twigs, hay and leaves to 10 parts de-ionized water,
I could have tested a ratio of 1:5 with 1 part twigs, hay and leaves and 5 parts de-ionized water. Working out the
optimum concentration of smoke water would help this experiment as better and clearer results could be
obtained.
Conclusion
In conclusion, the experiment supported my hypothesis that smoke water will successfully germinate more
Eucalyptus pilularis than de-ionized water. Furthermore, smoke water was found to be more effective than
de-ionized water at promoting subsequent growth: the Eucalyptus pilularis seedlings that were exposed to the
smoke water were significantly taller. This could be because the various
chemicals, such as phosphorous and nitrogenous compounds found in the smoky remnants of the burnt organic
matter (in my case, the burnt leaves, hay and twigs), acted as chemical triggers for the E. pilularis to begin its
germination out of its dormant state and stimulated its subsequent growth. While not all of the active compounds in
smoke have yet been identified, a large majority of the compounds present in the smoke water mixture
(NaNO3, KNO3, NH4Cl and NH4NO3) are water soluble, thus they are easily able to be taken in by the gumnut seed
and, once inside the seed, they are used as these so-called "chemical triggers" to start germination. These
chemical triggers work by altering the levels of chemicals that the seed maintains in homeostasis; once the seed
has registered these differing levels of phosphorous and nitrogenous compounds, it stimulates the germination of
the seed. There are, however, compounds called butenolides that have confirmed germination-promoting action.
These butenolides are produced by some plants on exposure to high temperatures and smoke caused by bush
fires. In particular, botanists Flematti, Ghisalberti, Dixon and Trengove isolated a particular butenolide called
3-methyl-2H-furo[2,3-c]pyran-2-one, which was found to trigger seed germination in plants whose reproduction is
fire-dependent, such as the E. pilularis used in my experiment 3. One theory about how this butenolide
(3-methyl-2H-furo[2,3-c]pyran-2-one) is formed by the plant is given to us by Light, Burger and van Staden, who
hypothesized that this particular butenolide was created from cellulose within the plant, and this substance,
created by the cellulose, stimulated the seed's reproductive cycle, and hence, germination 4. The two pie graphs
that show the percentage of seeds germinated for the smoke water experiment and de-ionized water experiment
respectively, furthermore indicate that my hypothesis was correct, with 88% of the smoke watered seeds
successfully germinating compared to only 47% of the de-ionized water seeds germinating. This was backed up
with my χ2-test, which concluded that we could reject the null hypothesis with a 95% degree of
confidence, indicating that the smoke water successfully germinated more seeds than the de-ionized water. The t-test on the
seedling growth shows that the smoke water has a significant positive effect on the gumnut seedlings.
Ev : Successful interpretation of the
results. Relevant justified conclusion
drawn
Bibliography
Yates Gardening Ltd, Sydney, Australia. http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/ Last visited July 10 2011.
Gavin R. Flematti, Emilio L. Ghisalberti, Kingsley W. Dixon and Robert D. Trengove. A Compound from Smoke That Promotes Seed Germination. http://www.sciencemag.org/content/305/5686/977 Science, 13 August 2004: Vol. 305 no. 5686 p. 977. Published online July 8 2004.
Marnie E. Light, Barend V. Burger and Johannes van Staden. Formation of a Seed Germination Promoter from Carbohydrates and Amino Acids. http://pubs.acs.org/doi/abs/10.1021/jf050710u J. Agric. Food Chem., 2005, 53 (15), pp 5936–5942. Publication date (Web): July 1, 2005.
Ev: Unsafe assumption
Ev: Feasible extension proposed.
Ev: Compare to relevant scientific theory
2 http://www.yates.com.au/products/pots-and-potting-mix/all-purpose-potting-mix/yates-premium-potting-mix/
3 http://www.sciencemag.org/content/305/5686/977
4 http://pubs.acs.org/doi/abs/10.1021/jf050710u
Page 34
Appendix A - raw data tables
An: Raw data recorded includes uncertainties
Comm: Decimal places should be consistent

Seeds watered with Smoke Water (Trial 1)
Columns: Seed Number; Did the seed germinate; Height of seedling / mm (±0.5 mm)
1   Yes  56.0
2   Yes  71.0
3   Yes  73.0
4   Yes  67.0
5   Yes  54.0
6   No   0
7   Yes  58.0
8   Yes  70.0
9   Yes  66.0
10  Yes  61.0
11  Yes  64.0
12  Yes  71.0
13  No   0
14  No   0
15  Yes  59.0
16  Yes  67.0
17  Yes  58.0
18  Yes  63.0
19  Yes  62.0
20  Yes  64.0
21  Yes  72.0
22  Yes  75.0
23  No   0.0
24  Yes  68.0
25  Yes  64.0
26  Yes  69.0
27  Yes  70.0
28  No   0
29  Yes  52.0
30  No   0
31  Yes  79.0
32  Yes  81.0
33  Yes  83.0
34  Yes  74.0
35  Yes  74.0
36  Yes  78.0
37  Yes  63.0
38  Yes  69.0
39  Yes  58.0
40  Yes  70.0
41  Yes  68.0
42  Yes  62.0
43  Yes  63.0
44  Yes  68.0
45  Yes  58.0
46  Yes  81.0
47  Yes  68.0
48  Yes  73.0
49  Yes  67.0
50  No   0

Seeds watered with De-ionized water (Trial 1)
Columns: Seed Number; Did the seed germinate; Height of seedling / mm (±0.5 mm)
1   Yes  18
2   Yes  27.0
3   Yes  19.0
4   No   0
5   No   0
6   No   0
7   Yes  24.0
8   No   0
9   Yes  25.0
10  No   0
11  Yes  28.0
12  No   0
13  No   0
14  Yes  17.0
15  Yes  23.0
16  No   0
17  Yes  16.0
18  No   0
19  Yes  26.0
20  Yes  27.0
21  Yes  15.0
22  No   0
23  No   0
24  Yes  27.0
25  No   0
26  Yes  21.0
27  Yes  22.0
28  No   0
29  Yes  27.0
30  Yes  37.0
31  No   0
32  No   0
33  Yes  26.0
34  Yes  31.0
35  No   0
36  No   0
37  Yes  27.0
38  Yes  41.0
39  No   0
40  No   0
41  No   0
42  Yes  25.0
43  No   0
44  Yes  19.0
45  No   0
46  No   0
47  Yes  37.0
48  Yes  22.0
49  No   0
50  Yes  25.0
Page 35
Seeds watered with Smoke water (Trial 2)
Columns: Seed Number; Did the seed germinate; Height of seedling / mm (±0.5 mm)
1   Yes  72.0
2   Yes  73.0
3   No   0
4   Yes  72.0
5   Yes  57.0
6   Yes  74.0
7   Yes  79.0
8   Yes  62.0
9   Yes  78.0
10  Yes  64.0
11  Yes  72.0
12  Yes  79.0
13  Yes  72.0
14  Yes  57.0
15  Yes  56.0
16  Yes  83.0
17  Yes  63.0
18  No   0
19  Yes  72.0
20  Yes  63.0
21  No   0
22  Yes  58.0
23  Yes  81.0
24  Yes  57.0
25  Yes  62.0
26  No   0
27  Yes  74.0
28  Yes  73.0
29  Yes  83.0
30  Yes  58.0
31  Yes  74.0
32  Yes  57.0
33  Yes  63.0
34  Yes  79.0
35  Yes  60.0
36  Yes  74.0
37  Yes  79.0
38  Yes  57.0
39  Yes  86.0
40  Yes  53.0
41  Yes  56.0
42  Yes  67.0
43  Yes  63.0
44  Yes  68.0
45  Yes  54.0
46  Yes  68.0
47  Yes  68.0
48  No   0
49  Yes  62.0
50  Yes  72.0

Seeds watered with De-Ionized water (Trial 2)
Columns: Seed Number; Did the seed germinate; Height of seedling / mm (±0.5 mm)
1   No   0
2   Yes  26.0
3   Yes  21.0
4   Yes  23.0
5   No   0
6   No   0
7   No   0
8   No   0
9   Yes  31.0
10  No   0
11  Yes  14.0
12  No   0
13  No   0
14  Yes  16.0
15  Yes  18.0
16  No   0
17  No   0
18  No   0
19  Yes  26.0
20  Yes  31.0
21  Yes  25.0
22  No   0
23  No   0
24  Yes  21.0
25  No   0
26  Yes  31.0
27  Yes  26.0
28  No   0
29  Yes  23.0
30  Yes  36.0
31  No   0
32  No   0
33  Yes  14.0
34  Yes  23.0
35  No   0
36  No   0
37  Yes  23.0
38  Yes  27.0
39  No   0
40  No   0
41  No   0
42  Yes  24.0
43  No   0
44  Yes  45.0
45  No   0
46  No   0
47  Yes  42.0
48  Yes  23.0
49  No   0
50  No   0
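As a rough illustration of how the raw data tables feed the processed results, the sketch below summarises a small excerpt: the first ten rows of the Smoke Water (Trial 1) table. It is not the student's own processing, and the function name `summarise` is hypothetical. It reports the mean height both for germinated seeds only and for all seeds with ungerminated seeds counted as 0 mm, because the two choices give different means. The degrees of freedom of 198 quoted in the t-test suggest that all 100 seeds per treatment were included, but that is an inference and is not stated in the report.

```python
from statistics import mean

# First ten rows of the Smoke Water (Trial 1) table: (germinated, height in mm).
rows = [
    ("Yes", 56.0), ("Yes", 71.0), ("Yes", 73.0), ("Yes", 67.0), ("Yes", 54.0),
    ("No", 0.0), ("Yes", 58.0), ("Yes", 70.0), ("Yes", 66.0), ("Yes", 61.0),
]


def summarise(rows):
    """Germination percentage and mean heights for a list of table rows."""
    heights_germinated = [h for g, h in rows if g == "Yes"]
    return {
        "germination_percent": 100 * len(heights_germinated) / len(rows),
        # Mean over germinated seeds only.
        "mean_height_germinated": mean(heights_germinated),
        # Mean over every seed, counting ungerminated seeds as 0 mm.
        "mean_height_all": mean(h for _, h in rows),
    }


summary = summarise(rows)
print(summary)
```

Whichever convention is chosen, it should be stated alongside the processed data so that the means, standard deviations and t-test can be followed.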
Page 36
Page 37
STUDENT DRAFT self-assessment (columns: MARK, POINTS)
Grade boundaries (grade: total marks): 1: 0–3; 2: 4–6; 3–4: 7–13; 5: 14–16; 6: 17–19; 7: 20–24

PERSONAL ENGAGEMENT
Mark 0
• The student’s report does not reach a standard described by the descriptors below.
Mark 1
• The evidence of personal engagement with the exploration is limited with little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing, implementation or presentation of the investigation.
Mark 2
• The evidence of personal engagement with the exploration is clear with significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing, implementation or presentation of the investigation.

EXPLORATION
Mark 0
• The student’s report does not reach a standard described by the descriptors below.
Marks 1–2
• The topic of the investigation is identified and a research question of some relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of limited relevance and does not aid the understanding of the context of the investigation.
• The methodology of the investigation is only appropriate to address the research question to a very limited extent since it takes into consideration few of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
Marks 3–4
• The topic of the investigation is identified and a relevant but not fully focused research question is described.
• The background information provided for the investigation is mainly appropriate and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the research question but has limitations since it takes into consideration only some of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.
Marks 5–6
• The topic of the investigation is identified and a relevant and fully focused research question is clearly described.
• The background information provided for the investigation is entirely appropriate and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research question because it takes into consideration all, or nearly all, of the significant factors that may influence the relevance, reliability and sufficiency of the collected data.
• The report shows evidence of full awareness of the significant safety, ethical or environmental issues that are relevant to the methodology of the investigation*.

ANALYSIS
Mark 0
• The student’s report does not reach a standard described by the descriptors below.
Marks 1–2
• The report includes insufficient relevant raw data to support a valid conclusion to the research question.
• Some basic data processing is carried out but is either too inaccurate or too insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the conclusion is invalid or very incomplete.
Marks 3–4
• The report includes relevant but incomplete quantitative and qualitative raw data that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a broadly valid conclusion but there are significant inaccuracies and inconsistencies in the processing.
• The report shows evidence of some consideration of the impact of measurement uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited conclusion to the research question can be deduced.
Marks 5–6
• The report includes sufficient relevant quantitative and qualitative raw data that could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy required to enable a conclusion to the research question to be drawn that is fully consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed conclusion to the research question can be deduced.
Page 38
EVALUATION
Mark 0
• The student’s report does not reach a standard described by the descriptors below.
Marks 1–2
• A conclusion is outlined which is not relevant to the research question or is not supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are outlined but are restricted to an account of the practical or procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the improvement and extension of the investigation.
Marks 3–4
• A conclusion is described which is relevant to the research question and supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are described and provide evidence of some awareness of the methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the improvement and extension of the investigation.
Marks 5–6
• A detailed conclusion is described and justified which is entirely relevant to the research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and sources of error, are discussed and provide evidence of a clear understanding of the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement and extension of the investigation.

COMMUNICATION
Mark 0
• The student’s report does not reach a standard described by the descriptors below.
Marks 1–2
• The presentation of the investigation is unclear, making it difficult to understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on focus, process and outcomes is missing or is presented in an incoherent or disorganized way.
• The understanding of the focus, process and outcomes of the investigation is obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and conventions*.
Marks 3–4
• The presentation of the investigation is clear. Any errors do not hamper understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus, process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and correct. Any errors do not hamper understanding.
Page 39
STUDENT DRAFT peer review (columns: MARK, POINTS)
The grade boundaries, criteria (PERSONAL ENGAGEMENT, EXPLORATION, ANALYSIS, EVALUATION, COMMUNICATION), mark bands and descriptors on this form are identical to those of the DRAFT self-assessment rubric on page 37.
Page 41
1
2
3
4
0123
456
7 8 9 10 11 12 13
PERSONAL ENGAGEMENT
The student’s report does not reach a standard described by the descriptors below.
•
STUDENT
MARK
DRAFT post-peer review self-assmt. POINTS
0
1
2
0
1
2
3
4
5
6
0
1
2
3
4
5
6
•
• The evidence of personal engagement with the exploration is limited with
little independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under
investigation does not demonstrate personal significance, interest or curiosity.
• There is little evidence of personal input and initiative in the designing,
implementation or presentation of the investigation.
• The evidence of personal engagement with the exploration is clear with
significant independent thinking, initiative or creativity.
• The justification given for choosing the research question and/or the topic under
investigation demonstrates personal significance, interest or curiosity.
• There is evidence of personal input and initiative in the designing,
implementation or presentation of the investigation.
EXPLORATION
• The student’s report does not reach a standard described by the descriptors below.
•
• The topic of the investigation is identified and a research question of some
relevance is stated but it is not focused.
• The background information provided for the investigation is superficial or of
limited relevance and does not aid the understanding of the context of the
investigation.
• The methodology of the investigation is only appropriate to address the research
question to a very limited extent since it takes into consideration few of the
significant factors that may influence the relevance, reliability and sufficiency of
the collected data.
• The report shows evidence of limited awareness of the significant safety, ethical
or environmental issues that are relevant to the methodology of the
investigation*.
• The topic of the investigation is identified and a relevant but not fully focused
research question is described.
• The background information provided for the investigation is mainly appropriate
and relevant and aids the understanding of the context of the investigation.
• The methodology of the investigation is mainly appropriate to address the
research question but has limitations since it takes into consideration only some of
the significant factors that may influence the relevance, reliability and sufficiency
of the collected data.
• The report shows evidence of some awareness of the significant safety, ethical or
environmental issues that are relevant to the methodology of the investigation*.
• The topic of the investigation is identified and a relevant and fully focused
research question is clearly described.
• The background information provided for the investigation is entirely appropriate
and relevant and enhances the understanding of the context of the investigation.
• The methodology of the investigation is highly appropriate to address the research
question because it takes into consideration all, or nearly all, of the significant
factors that may influence the relevance, reliability and sufficiency of the
collected data.
• The report shows evidence of full awareness of the significant safety, ethical or
environmental issues that are relevant to the methodology of the investigation*.
ANALYSIS
• The student’s report does not reach a standard described by the descriptors below.
• The report includes insufficient relevant raw data to support a valid conclusion
to the research question.
• Some basic data processing is carried out but is either too inaccurate or too
insufficient to lead to a valid conclusion.
• The report shows evidence of little consideration of the impact of measurement
uncertainty on the analysis.
• The processed data is incorrectly or insufficiently interpreted so that the
conclusion is invalid or very incomplete.
• The report includes relevant but incomplete quantitative and qualitative raw data
that could support a simple or partially valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out that could lead to a
broadly valid conclusion but there are significant inaccuracies and inconsistencies
in the processing.
• The report shows evidence of some consideration of the impact of measurement
uncertainty on the analysis.
• The processed data is interpreted so that a broadly valid but incomplete or limited
conclusion to the research question can be deduced.
• The report includes sufficient relevant quantitative and qualitative raw data that
could support a detailed and valid conclusion to the research question.
• Appropriate and sufficient data processing is carried out with the accuracy
required to enable a conclusion to the research question to be drawn that is fully
consistent with the experimental data.
• The report shows evidence of full and appropriate consideration of the impact of
measurement uncertainty on the analysis.
• The processed data is correctly interpreted so that a completely valid and detailed
conclusion to the research question can be deduced.
•
5
14 15 16
6
17 18 19
7
20 21 22 23 24
0
1
2
3
4
5
6
0
1
2
3
4
Page 42
EVALUATION
• The student’s report does not reach a standard described by the descriptors below.
• A conclusion is outlined which is not relevant to the research question or is not
supported by the data presented.
• The conclusion makes superficial comparison to the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and
sources of error, are outlined but are restricted to an account of the practical or
procedural issues faced.
• The student has outlined very few realistic and relevant suggestions for the
improvement and extension of the investigation.
• A conclusion is described which is relevant to the research question and
supported by the data presented.
• A conclusion is described which makes some relevant comparison to the accepted
scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and
sources of error, are described and provide evidence of some awareness of the
methodological issues* involved in establishing the conclusion.
• The student has described some realistic and relevant suggestions for the
improvement and extension of the investigation.
• A detailed conclusion is described and justified which is entirely relevant to the
research question and fully supported by the data presented.
• A conclusion is correctly described and justified through relevant comparison to
the accepted scientific context.
• Strengths and weaknesses of the investigation, such as limitations of the data and
sources of error, are discussed and provide evidence of a clear understanding of
the methodological issues* involved in establishing the conclusion.
• The student has discussed realistic and relevant suggestions for the improvement
and extension of the investigation.
COMMUNICATION
• The student’s report does not reach a standard described by the descriptors below.
• The presentation of the investigation is unclear, making it difficult to
understand the focus, process and outcomes.
• The report is not well structured and is unclear: the necessary information on
focus, process and outcomes is missing or is presented in an incoherent or
disorganized way.
• The understanding of the focus, process and outcomes of the investigation is
obscured by the presence of inappropriate or irrelevant information.
• There are many errors in the use of subject-specific terminology and
conventions*.
• The presentation of the investigation is clear. Any errors do not hamper
understanding of the focus, process and outcomes.
• The report is well structured and clear: the necessary information on focus,
process and outcomes is present and presented in a coherent way.
• The report is relevant and concise thereby facilitating a ready understanding of the
focus, process and outcomes of the investigation.
• The use of subject-specific terminology and conventions is appropriate and
correct. Any errors do not hamper understanding.
IB STATISTICS HANDBOOK
Mag Karl Schauer BSc
(Image credit: xkcd.com)
Why Statistics?
What should you know?
Types of Data
    Categorical data
    Ordinal data
    Numerical data
    Frequency
    Other data
Sampling Techniques
    Random sampling
    Systematic sampling
    Stratified sampling
Descriptive statistics
    Averages
        Mean
        Median
        Mode
    Measures of spread
        Standard deviation
        Minimum, Maximum and Range
        Quartiles and Interquartile Ranges
    Histograms
    The normal curve
Data tables
    Title
    Labels
    Data
    Summary statistics
    Formatting
Graphical techniques
    Formatting and Labelling
    Bar graphs
    Line graphs
    Scatter plots and correlation
        Correlation
        Extrapolation and Interpolation
    Other graphs
Hypothesis Testing
    Testing for differences
        The T-test
        ANOVA
        The Mann-Whitney U test
    Testing for Correlation
        The Pearson correlation test
        Spearman Rank correlation test
    Other statistical tests
        The chi-squared test
        Nearest neighbour analysis
Critical Value tables
How to use spreadsheet software
More Help and further reading
This handbook is intended to be used digitally, and contains some cross-referencing and external links. Underlined text, as well as the table of contents, can be clicked to take you where you want to go.
WHY STATISTICS?
An academic investigation is a way to try to answer a question. This question must be defined, and a method determined to collect appropriate data. Predictions are then made based on the knowledge gained by answering previous questions in previous investigations. So where do statistics come in? Statistics are the tool you need to boil down all of your carefully collected data into a clear answer. Importantly, they also tell you how sure you can be of that answer.
WHAT SHOULD YOU KNOW?
In order to complete your internally assessed work or data-based extended essays in the IB, you will need to apply some basic statistics. You will need to summarise and describe your data using descriptive statistics like averages and standard deviations. Then you will need to present your data in tables and graphs. Finally, depending on the investigation, you may need to perform a hypothesis test or other calculations to definitively answer your question. You won’t generally be expected to do the sometimes complicated calculations by hand. Tools like spreadsheet software (Excel, LibreOffice etc.) or your TI-Nspire handheld make many calculations a trivial matter of entering numbers. You will likely need to look up tutorials for your software online: there are many different software packages and platforms, but tutorials are readily available (see this list of resources for help). What is trickier, and mostly up to you, is deciding what statistics to apply in what circumstances, and understanding what those calculations tell you. This handbook should help you with those decisions and that understanding.
TYPES OF DATA
You might encounter all types of data in your investigations. It is important to distinguish between a few different types of data because not all statistical techniques work with all types of data.
CATEGORICAL DATA
This type of data fits into defined categories. For example: red, green and blue as options for people’s favourite colour are categories.
ORDINAL DATA
This is similar to categorical data, but there is a clear order of the groups. For example: low, medium and high income categories. These categories don’t always or necessarily have the same distances between them.
NUMERICAL DATA
This type of data includes measurements of all kinds. There is a clear order, as in ordinal data, but the distances between data points are clearly defined. Length, mass and speed are all numerical data.
FREQUENCY
When a statistician uses the word frequency, they generally mean a count of the number of things. ‘What is the frequency of…’ can usually be translated to ‘how many…’
OTHER DATA
This list is certainly not complete. There are other specialty types of data that you might encounter, but these should be sufficient for most of your investigations.
SAMPLING TECHNIQUES
Since you will never have enough time or resources to measure all of the possible data points in the population (and if you use statistics, you shouldn’t need to), you will always only measure a small portion of all of the possible points called a sample. But which data points should go into the sample? In order to have a fair test, it is important that each possible data point is equally likely to be chosen for the sample. That is to say, there should be no sampling bias. In order to do this, you will need to use a sampling strategy that fits your investigation.
RANDOM SAMPLING
Just to be clear, it is not sufficient to claim that a sample is random if you have simply chosen ‘at random’ places to measure. You might have a subconscious bias for certain measurements. To be truly random, you will need to assign a number to each possible data point, and use a random number generator to tell you which measurements to collect.
One way to do this is with the RAND() function in spreadsheet software. Simply enter ‘=RAND()’ in a cell, and it will show you a random number between 0 and 1. You can then multiply this number by whatever you need to in order to have a random number between 0 and that number. For example: if you wanted a random number between 0 and 100, you could simply type ‘=RAND()*100’ into a cell.
SYSTEMATIC SAMPLING
In this technique, you simply choose to sample at regular intervals. For example, you might choose to make a measurement every ten meters on a transect.
STRATIFIED SAMPLING
This more complicated sampling method is only used if the population is made up of different sub-sets that make up different proportions of the whole. It might be important to make sure that one sub-set isn’t being over- or under-represented in the data. This is most commonly used with survey data.
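The random and systematic sampling strategies described above can also be sketched outside a spreadsheet. The following is a minimal Python sketch, assuming a hypothetical set of 100 numbered sampling points from which 10 are measured; the function names, the seed and the sizes are illustrative assumptions, not part of the handbook.

```python
import random

def random_sample_points(n_points, sample_size, seed=None):
    """Number every possible data point 1..n_points and let a random
    number generator choose which ones to measure (no repeats)."""
    rng = random.Random(seed)  # a seed only makes the choice repeatable
    return sorted(rng.sample(range(1, n_points + 1), sample_size))

def systematic_sample_points(n_points, interval, start=1):
    """Choose points at regular intervals, e.g. every 10th point."""
    return list(range(start, n_points + 1, interval))

# Hypothetical example: 100 possible points, 10 measurements
random_points = random_sample_points(100, 10, seed=1)
systematic_points = systematic_sample_points(100, 10)

print("Random sample:    ", random_points)
print("Systematic sample:", systematic_points)
```

Because the generator, not the investigator, picks the points, every point has the same chance of being chosen, which is the property the random sampling section asks for.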
DESCRIPTIVE STATISTICS
Once you have collected your data, you will need to boil it down. Descriptive statistics, sometimes called summary statistics, do just that: they help your reader see the general trends and patterns in your data.
AVERAGES
MEAN
This is generally what is meant when someone says ‘average’. It is the sum of the values divided by the number of values. This is the most common way to summarise sample data.
MEDIAN
This is the ‘middle’ data point. There are just as many data points higher and lower than this one in the sample. This measure of average is more likely to be used if the sample is distributed in a strange way, or if outliers might strongly affect the mean. For example, in a sample measuring personal wealth, one or two billionaires in the sample might heavily skew the mean to show an average income that is higher than that of almost every single individual. In this case it would be appropriate to use median income to better represent the sample.
MODE
Mode is much less commonly used. It is the ‘most common’ data point, or in other words, the data point with the highest frequency.
Example, given the seven values in the sample 1, 3, 3, 5, 8, 11, 12:
Mean: 43/7 = 6.1 (average)
Median: 5 (middle)
Mode: 3 (most common)
MEASURES OF SPREAD
Just knowing the average of a group only tells part of the story. Another important aspect is the spread in the data. How similar are the data points to each other? Are they all basically the same, or are there wild differences? These measurements all describe how spread out the data is.
STANDARD DEVIATION
Standard deviation describes the typical distance between each data point and the mean. (Technically, it is the square root of the average squared difference from the mean.) A large standard deviation, relative to the size of the measurements, means that the data is very spread out. This means that there are generally large differences between data points. A small standard deviation, relative to the size of the measurements, indicates that the measurements are close together, or that they ‘agree’.
Standard deviations are useful for numerical data. If the data is not numerical, or is not normally distributed, you will need a different way to show the spread of the data.
MINIMUM, MAXIMUM AND RANGE
To give a basic idea of the spread of your data it is often good to include the total range of the data, that is to point out the highest and lowest measured values (maximum and minimum), and the distance between them (range).
QUARTILES AND INTERQUARTILE RANGES
These measures of spread apply to the median in a similar way to how the standard deviation applies to the mean. Here is how they are calculated:
1. Arrange the data in rank order and divide it into four equal parts, each containing an equal number of values. Each section is called a quartile. The quartile containing the highest values is the upper quartile, while the one with the lowest values is the lower quartile.
2. The Upper Quartile Value (UQV) or Q3 is the mean of the lowest value in the upper quartile and the highest value in the quartile below it.
3. The Lower Quartile Value (LQV) or Q1 is the mean of the highest value in the lower quartile and the lowest value in the quartile above it.
4. The Inter-Quartile Range (IQR) is the difference between the values calculated in #2 and #3. A high IQR means the data is very dispersed, while a low IQR means the data is less dispersed.
For example, for the ranked values 1 2 3 4 5 6 7 8 9 10 11, the lower quartile value Q1 is 3.5, the median Q2 is 6 and the upper quartile value Q3 is 8.5, so:
IQR = Q3 - Q1 = 8.5 - 3.5 = 5
For a more detailed explanation, visit this site:
https://stattrek.com/statistics/dictionary.aspx?definition=interquartile%20range
HISTOGRAMS
A histogram is a way to visualise data in a sample. It is essentially a bar chart with categories for the measured values (for example 1-10, 11-20, 21-30) on the x-axis and frequency (the number of data points in that category) on the y-axis.
A histogram can be useful to show if your data are normally distributed, that is they generally look like a bell curve. (See section on the normal curve.) It may be important to show this before you can use some types of hypothesis tests.
[Figure: histogram showing the distribution of age in a sample (x-axis: Age; y-axis: Frequency). It appears to be slightly right-skewed.]
Your histogram might not show a normal curve. The data might be skewed to one direction, or even show several peaks. These might be important aspects to bring up in your evaluation, and might help you choose what hypothesis test, if any, you can use.
[Figure: some shapes of histograms you might encounter: Normal, Right-skewed, Left-skewed.]
Here is a deeper explanation of histograms and how they are made by hand:
https://youtu.be/4eLJGG2Ad30
Here is a longer explanation of the different shapes you might encounter in histograms:
https://youtu.be/Y53_8WRrPzg
THE NORMAL CURVE
Many data just happen to fit to a normal distribution curve, also called a bell curve or, more technically, a gaussian curve. If your data fits this type of distribution, you can make some predictions using your data and the mathematics behind this curve.
[Figure: the standard, ‘bell’, or gaussian curve shows the pattern of how normally distributed data spreads around the mean. If your histogram looks like this, your data is probably normally distributed. The shaded areas under the curve represent the proportion of data points that will likely be found in this section of the curve. The x-axis shows standard deviation distances with the mean at 0.]
The area under the normal curve shows how many data points are likely to be found in any given range. 68% of data points will, on average, be within one standard deviation of the mean, and 95% will fall within 2 standard deviations. This is helpful in predicting probabilities, and this type of math is the basis for hypothesis tests.
The normal distribution curve is one of many used in statistics, but it is the most common shape you will likely encounter.
If your data appear to be normally distributed, that is, your histogram appears to have a normal curve shape, then you may be able to use some hypothesis tests that require normal data as a prerequisite.
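The descriptive statistics described above (mean, median, mode, standard deviation, range, quartiles and IQR) can be checked with a few lines of code. This is a minimal sketch using Python's built-in `statistics` module and the handbook's own example values. Two choices here are assumptions rather than something the handbook specifies: `stdev` is the sample standard deviation (dividing by n - 1), and `method="inclusive"` is one of several quartile conventions, picked because it gives 3.5 and 8.5 for the values 1 to 11.

```python
import statistics

# The seven example values used for the averages
sample = [1, 3, 3, 5, 8, 11, 12]

mean = statistics.mean(sample)        # sum of values / number of values
median = statistics.median(sample)    # middle value of the ranked data
mode = statistics.mode(sample)        # value with the highest frequency
sd = statistics.stdev(sample)         # sample standard deviation (n - 1)
value_range = max(sample) - min(sample)

# The eleven ranked values used for the quartile example
ranked = list(range(1, 12))
q1, q2, q3 = statistics.quantiles(ranked, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")
print(f"standard deviation = {sd:.1f}, range = {value_range}")
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Different software may use a different quartile convention and give slightly different quartile values for the same data, so it is worth stating which tool was used.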
DATA TABLES
Once you have boiled your data down into some tangible values, you will need to present the raw data and your descriptive statistics in well-organised tables. Designing data tables is an art form all its own. A few points might help you make yours beautiful.
TITLE
Be sure that each table has a meaningful and descriptive title (not just ‘table 1’). With multiple tables, it is usually a good idea to number them (hint: check that your numbers are right before you hand in a draft!) so that you can refer to them easily in your text.
LABELS
Your data columns need proper labels including:
• A clear descriptive title of what data is listed in the column,
• The appropriate units of those values, and
• The measurement precision of those values.
DATA
The data itself should have the correct number of significant figures to reflect the precision of the data (see these links for help with significant figures). Be careful not to show more precision (more significant figures) in an average. You will likely need to format the cells of your table to show the appropriate number of digits, since the trailing zeros disappear otherwise. If you have very large or very small values, simply use scientific notation.
SUMMARY STATISTICS
You may want to include your averages and standard deviations right in the table with your raw data. If you have a lot of data, or it is relatively complex, you might want to create a separate data table of your summary statistics. You should use whatever you think will help your reader see the data best.
FORMATTING
It is usually a good idea, if possible, to present your data table on one page. Having the first half of a table at the end of one page, with the last half continuing on the next makes it very hard to get an overview of the data. Also, try to size your columns carefully to fit them on the page, but not to muddle the titles.
Here is an example of a well-organised table:
Table 1: The height of 15 Z. mays plants after growing for 30 days at different fertiliser concentrations in three different field sites.

Concentration of      Plant height 30 days after germination (+/- 1mm)
fertiliser in soil    Field site 1   Field site 2   Field site 3   Average   Standard
(+/- 0.10 mg/kg)                                                              deviation
0.10                  345            330            404            360       39
0.20                  442            410            430            427       16
0.30                  510            470            550            510       40
0.40                  580            530            603            571       37
0.50                  200            130            240            190       56
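The Average and Standard deviation columns of Table 1 can be reproduced from the three field-site values in each row. This is a minimal Python sketch, assuming the sample standard deviation (dividing by n - 1) and rounding to the nearest millimetre to match the +/- 1mm precision of the measurements; the variable names are illustrative.

```python
import statistics

# Plant heights in mm from Table 1, keyed by fertiliser concentration (mg/kg)
heights_mm = {
    0.10: [345, 330, 404],
    0.20: [442, 410, 430],
    0.30: [510, 470, 550],
    0.40: [580, 530, 603],
    0.50: [200, 130, 240],
}

summary = {}
for concentration, heights in heights_mm.items():
    average = statistics.mean(heights)
    spread = statistics.stdev(heights)   # sample standard deviation (n - 1)
    # Round to whole millimetres so the summary does not claim more
    # precision than the raw measurements have
    summary[concentration] = (round(average), round(spread))

for concentration, (average, spread) in summary.items():
    print(f"{concentration:.2f} mg/kg: average = {average} mm, SD = {spread} mm")
```

Rounding only at the end, after the mean and standard deviation have been calculated from the raw values, avoids carrying rounding errors into the summary.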
GRAPHICAL TECHNIQUES
Once you have presented your data in tables, you will need to make it more readily visible to your reader. It is important to choose the right graph for the type of data you are presenting. The formatting and labelling of the graph is also important. If done well, graphs should show the reader the answer to your research question at a glance.
FORMATTING AND LABELLING
Generally speaking, the same rules for tables apply to graphs. Be sure that each graph has a clear and descriptive title, and that the axes are labelled in the same way that the columns of the corresponding tables are labelled. Make sure that the axes are scaled so that the data fill the graph, and that the scale numbers reflect the same level of precision as the data. Also, make sure that the independent variable (that you changed or defined on purpose) is on the x-axis, and that the y-axis shows your dependent variable (measured result). It is almost always best to graph your processed data, and not the raw data, unless there is something important you want to show the reader about your raw data.
BAR GRAPHS
Bar graphs are used to represent numerical data (y-axis) from different categories (x-axis). Bar graphs of averages should have error bars showing standard deviation, or some other measure of spread. Somewhere on the graph or in its caption you need to declare what the error bars represent. Be sure that you are using the unique standard deviation values that you calculated in your tables, and not the automatic values that some software packages apply (incorrectly).
[Figure 1: The average growth of cress seeds after growing for 4 days under different coloured light. The error bars represent one standard deviation. x-axis: Color of light applied to growing plants (Red, Blue, Green, Yellow, Orange); y-axis: Average growth of cress plants after 4 days (+/-1mm).]
LINE GRAPHS
Line graphs and scatter plots are often confused for each other. Line graphs show straight lines connecting the dots of the data points. This is to represent the fact that line graphs show multiple measurements of the same thing. The straight line is an assumed linear change of that measured value between measurements. Therefore, only use a line graph if you are tracking the change of something.
[Figure 2: Global average temperature from 1880 - 2000. x-axis: Year; y-axis: Average global temperature (+/-0.01°C).]
SCATTER PLOTS AND CORRELATION
Scatter plots are used to compare numerical values on both axes. If both your independent and dependent variables are numerical measurements, this is probably the type of graph you should use. Each dot on the graph represents a data point, and this can show trends in the data. Usually this type of investigation is looking for some sort of relationship between the two variables. You will need to start with this type of graph to look for correlations, or in order to perform interpolations or extrapolations.
If you graph average values, you will need error bars to show the spread of the data (see bar graphs). Be sure, however, to use all data points, not just averages, to calculate an R² value.
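The trend line and R² value that spreadsheet software reports for a scatter plot can also be calculated directly from the raw data points. This is a minimal Python sketch of an ordinary least-squares line of best fit; the tree ages and heights are made-up illustrative values (not data from this handbook), and the function name is an assumption.

```python
import math

def linear_fit(xs, ys):
    """Least-squares trend line y = slope * x + intercept, plus r and R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products of deviations from the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return slope, intercept, r, r ** 2

# Hypothetical raw data: every measured point, not averages
ages_years = [10, 20, 30, 40, 50]
heights_m = [4, 11, 19, 27, 33]

slope, intercept, r, r_squared = linear_fit(ages_years, heights_m)
print(f"y = {slope:.4f}x + {intercept:.4f}")
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

Passing every individual measurement to the fit, rather than one average per group, is what keeps the R² value honest about the scatter in the data.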
[Figure 1: This graph shows a strong positive correlation between the height and age of the sampled trees. x-axis: Age of tree (+/-1year); y-axis: Height of tree (+/-2m); trend line: y = 0.7334x - 2.8792, R² = 0.9189.]
CORRELATION
A correlation is a relationship between two numerical variables. A correlation can be positive or negative:
• Positive correlation: As the independent variable increases, the dependent variable also increases.
• Negative correlation: As the independent variable increases the dependent variable decreases.
[Figure: example scatter plots showing positive correlation, negative correlation and no correlation.]
The line of best fit or trend line is chosen by the computer to be as close as possible to all of the data points. It is an approximation of the linear trend in the data. The closer all of the data points are to the trend line, the stronger the correlation.
The degree of correlation is measured by the correlation coefficient, r or R, more technically called the ‘Pearson product-moment correlation coefficient’. This value ranges from -1 for a set of data that aligns perfectly to a line with a negative slope, to 1, for a set of data points that align perfectly to a line with a positive slope. The closer the r value is to either -1 or 1, the stronger the relationship between the two variables.
[Figure: example scatter plots for r = 1, 0.8, 0.3, 0, -0.3, -0.8 and -1.]
Alternatively, correlation can be reported as the coefficient of determination, r² or R². This is simply the correlation coefficient squared, which therefore always has a value between 0 and 1. This value is defined as the proportion of the variance in the dependent variable that can be predicted by the independent variable. For example, with an r² value of 1, all of the data points align perfectly. That means that the value of the dependent variable can be predicted exactly for any value of the independent variable. For an r² value of 0.8, 80% of the variance in the dependent variable is predicted by the independent variable, and the remaining 20% is scatter that the trend line does not explain. Predicting the values of variables based on a correlation is called extrapolation or interpolation.
In order to make claims about a linear correlation, it is important that the data show a linear trend. If the data are not linear, or not expected to be linear, it is not appropriate to compare them to a trend line! Here are some examples of when linear regression is not appropriate:
• Enzyme activity is expected to rise as temperature increases, peak at the optimum temperature for that enzyme, then drop sharply as the enzyme denatures at higher temperatures. A graph comparing temperature and enzyme activity might therefore look something like this:
[Figure: Enzyme activity against Temperature (+/-1°C), rising to a peak and then dropping sharply. This shape of a graph should not be compared to a line.]
• The rate of a chemical reaction slows over time as substrate is used up. The shape of the curve produced is predictable and depends on the type of reaction. The graph of the concentration of the product over time might look something like this:
[Figure: Concentration of product (+/- 0.1M) against Time (+/-1s), rising quickly and then levelling off. This shape of a graph should not be compared to a line. Instead, a Spearman Rank test can be performed.]
When interpreting r or r² values, be sure to be realistic about the strength of the correlation. An r² value of 0.3 may or may not indicate any kind of relevant relationship. If you want more certainty about whether your correlation is statistically significant, you should consider using a Pearson’s R test for correlation or a Spearman’s Rank test. You can read more about these tests in the section on hypothesis testing.
EXTRAPOLATION AND INTERPOLATION
If you have determined a strong linear relationship in your data, you can use this data to make predictions. You can use the equation of the line of best fit to calculate the expected value for an unknown.
Interpolation is using the trend line to predict values within the range of your data. Extrapolation is expanding the trend line beyond the data to make predictions outside of the range of data. The further the predicted value is from the measured values, the less reliable the extrapolated value will be.
An example of extrapolation is using current trends in climate change to make predictions about how the planet’s climate will continue to change in the future. This is how climatologists make predictions about how warm the earth might be in the coming decades.
An example of interpolation is determining the osmolarity of a tissue. Suppose you measured the rate of osmosis in potato tissue in various concentrations of sugar solution. Your data were linear and looked like this:
[Figure: scatter plot of Change in mass of potato tissue after 1h (+/-0.01g) against Concentration of sucrose solution (+/-0.01M), with trend line y = -7.1143x + 2.073, R² = 0.9659, and the x-intercept marked at 0.29. You must use all of the data points (not just the averages!) in order for your software to accurately calculate R².]
The high R² value of 0.97 shows that the data have a strong linear correlation. The negative slope value of -7.1143 shows that the relationship is a negative correlation. Because the trend is very strong, the equation for the line can be used to make predictions. You were asked to determine what concentration would be isotonic to the potato tissue, that is, at what concentration no net osmosis would occur. At this concentration the change in the mass of the potato tissue would be zero. To find the corresponding concentration, you can substitute zero for y (the change in mass) in the trend line equation, and solve for x (the concentration):
y = -7.1143x + 2.073
0 = -7.1143x + 2.073
-2.073 = -7.1143x
x = -2.073 / -7.1143
x = 0.2914
This value of 0.29 M is where the trend line crosses the x axis, and it is the osmolarity, or isotonic concentration of the potato tissue. It must be rounded to reflect the precision of the measurements that were used to calculate it. This process could be repeated to predict any given change in mass or any concentration within the range of the data.
OTHER GRAPHS
Though the graphs listed above are the most likely you will need, there are of course many other types of graphs. Here are two others that you might consider:
Pie charts show the breakdown of a group into its parts, usually percentages. The percentages should add up to 100. Avoid too many categories, as the chart can quickly become difficult to read.
Radar charts can be used to show many different attributes at once, and compare these between locations or individuals.
HYPOTHESIS TESTING
The goal of an experiment or investigation is to answer a specific question. The data
should make it clear what the answer to that question is. Often, due to the uncertainty
inherent in data, the answer may not be entirely clear. It may look like there is a difference
between two groups, but the difference might only be due to chance. It may appear that
there is a correlation between two variables, but the sample may have been a fluke.
Hypothesis testing allows you to determine how sure you are of the answer, and the
likelihood of the observed pattern being due to chance.
A hypothesis test requires that you make an assumption, and calculate how probable your
data would be if that assumption were true. This assumption is called the null hypothesis, H0.
If results like yours would be very unlikely under the null hypothesis, then you can reject it
and conclude instead that the alternative hypothesis, HA, is supported. Despite the naming,
these hypotheses are different from your experimental hypothesis, that is, your reasoning
about what you think will happen in your experiment. You always need to declare and explain
an experimental hypothesis in the exploration portion of your work. You only need to declare
null and alternative hypotheses in the context of your hypothesis test, if you choose to use one.
This should be included in your explanation of the data analysis.
A hypothesis test generates a test statistic. The value of this test statistic tells you how
well your data agree with the null hypothesis. That test statistic can then be compared to a
table of critical values that it must be higher or lower than in order to conclude a
statistically significant result.
Usually this process is simplified, and a p-value can be calculated based on the test
statistic. The p of p-value stands for probability, and it is the probability of obtaining
results at least as extreme as yours if the null hypothesis were true. It is always a value
between 0 and 1 (i.e. 0 and 100%). If the p-value is low enough, then your data would be very
unlikely under the null hypothesis, and it can be rejected. When the null hypothesis is
rejected, the alternative hypothesis can be concluded, and there is a statistically significant
result.
The p-value is compared to the alpha value. For our intents and purposes you will use an
alpha value of 0.05. This is the threshold probability below which you decide that your
results are too unlikely to have arisen under the null hypothesis. That is, if the p-value is
less than 0.05 (5%), then you should reject the null hypothesis. For an example of this
process, read the section on t-testing.
Page 68
If the p-value is above the alpha threshold of 0.05, then you must ‘fail to reject the null
hypothesis’. This is different from accepting the null hypothesis! You don’t have enough
evidence to conclude that the null hypothesis is true. Instead you simply ‘fail to reject’ and
conclude that you cannot be sure whether the observed result is due to random chance or
a real effect.
For example, if a test gives a p-value of 0.2, results at least as extreme as yours would turn
up about 20% of the time even if the null hypothesis were true. That is far too often to rule
out chance, but it is not evidence that the null hypothesis is true either. Therefore you
simply ‘fail to reject the null hypothesis’.
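The decision rule described above can be sketched in a few lines of Python. This is only an illustration: `interpret_p_value` is a hypothetical helper name, not a function from any statistics package, and the default of 0.05 is simply the alpha value this handbook uses.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Compare a p-value with the alpha threshold and report the decision."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    # A p-value at or above alpha is NOT evidence that H0 is true.
    return "fail to reject the null hypothesis"


print(interpret_p_value(0.01))  # below the 0.05 threshold
print(interpret_p_value(0.2))   # above the 0.05 threshold
```

Note that the function never returns "accept the null hypothesis", which mirrors the wording used in the text.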
TESTING FOR DIFFERENCES
Often an investigation aims to find differences between groups. The t-test, ANOVA, and
the Mann-Whitney U test are different ways to determine whether observed differences are
statistically significant, or just due to random chance.
The t-test is used when the data can be assumed to be normal and the sample sizes are
relatively large (more than 10 measurements). It might be a good idea to make a
histogram to see if the data appear to be normal, but at the very least you should state that
you assume the data to be normally distributed, and why you think they are.
If the assumptions for normality are met for the t-test, but you have more than two
groups, you will need to perform an ANOVA (analysis of variance) test to see if the
variability between the groups is due to chance or some real effect.
If it is not safe to assume that the data are normally distributed, you have small samples,
or your data are ordinal, but not numerical, then you can make a comparison between two
groups using the Mann-Whitney U Test instead. This test is less likely to find a difference
if there is one, but it is safer to use if the prerequisites for a t-test are unclear or not met.
THE T-TEST
The t-test assumes a null hypothesis that there is no significant difference between the
groups (any observed difference is due to chance), then calculates how probable a difference
at least as large as the observed one would be under that hypothesis.
H0: There is no significant difference between the two groups
HA: The observed difference between the groups is statistically significant, and not
likely due to chance.
Suppose you want to find out if dandelions (T. officinale) grow to different heights in two
different types of soil. In your experiment you measure the average growth of the
dandelions in each of two soil types.
Maximum achieved height of T. officinale plants (+/-0.1cm)
Soil a    Soil b    Soil a    Soil b
8.0       15.5      6.3       14.7
9.1       14.5      13.2      12.2
12.0      10.1      6.3       15.0
10.0      12.1      11.0      13.2
12.1      16.0      9.8       14.2
8.5       13.1      12.2      9.9
9.7       17.8      10.1      10.3
13.2      16.4      10.3      19.0
Average:             Soil a 10.1    Soil b 14.0
Standard deviation:  Soil a 2.1     Soil b 2.7
Figure 1: The average maximum height of 16 T. officinale plants grown in two
different soil types. Error bars represent one standard deviation.
[Bar chart: average maximum height of plants (+/-0.1cm) for Soil a and Soil b, y axis from 0.0 to 17.0]
You notice a difference between the groups. The plants in soil b have grown taller on
average than the plants in soil a. Since the error bars are overlapping, it is hard to say
whether this observed difference is due to chance, or whether the two are really different.
Therefore you decide to perform a t-test to find out.
First, you need to determine whether the data are normally distributed, so you make a
histogram to see.
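As a rough sketch of what making that histogram involves, the snippet below counts how many dandelion heights fall into each 2 cm bin. The data are the 16 heights per soil from the table; the bin start of 6.0 cm, the bin width and the helper name `bin_counts` are assumptions chosen for this illustration.

```python
from collections import Counter

# Heights (cm) from the dandelion table, one list per soil type.
soil_a = [8.0, 6.3, 9.1, 13.2, 12.0, 6.3, 10.0, 11.0,
          12.1, 9.8, 8.5, 12.2, 9.7, 10.1, 13.2, 10.3]
soil_b = [15.5, 14.7, 14.5, 12.2, 10.1, 15.0, 12.1, 13.2,
          16.0, 14.2, 13.1, 9.9, 17.8, 10.3, 16.4, 19.0]

BIN_START = 6.0  # lower edge of the first bin (assumed)
BIN_WIDTH = 2.0  # width of each bin in cm (assumed)
N_BINS = 7       # covers 6.0 cm up to just under 20.0 cm


def bin_counts(values):
    """Return the number of values falling in each bin, lowest bin first."""
    counts = Counter(int((v - BIN_START) // BIN_WIDTH) for v in values)
    return [counts.get(i, 0) for i in range(N_BINS)]


for name, data in (("Soil a", soil_a), ("Soil b", soil_b)):
    print(name)
    for i, count in enumerate(bin_counts(data)):
        low = BIN_START + i * BIN_WIDTH
        # One '#' per plant gives a quick text histogram.
        print(f"  {low:4.1f}-{low + BIN_WIDTH - 0.1:4.1f} cm  {'#' * count}")
```

A roughly symmetrical hump with few values at the extremes is what you would hope to see before assuming normality.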
Page 69
Figure 2: A histogram of the data shows that it appears to be normally distributed.
[Histogram: frequency (0 to 5) of Soil a and Soil b plants against height of plants (+/-0.1cm), in bins 6.0-7.9, 8.0-9.9, 10.0-11.9, 12.0-13.9, 14.0-15.9, 16.0-17.9 and 18.0-19.9]
Because the data appear to be normally distributed, and the sample sizes are sufficiently
large (n=16), you can proceed with the t-test.
Using the T.TEST() function of your spreadsheet software, you enter the following
information:
=T.TEST(dataset from soil a, dataset from soil b, 2, 2)
• The decimal in the function name may or may not be required in your software.
• The datasets are the lists of raw values, not the averages.
• The two 2’s in the syntax tell the software what kind of t-test to perform. You want a
‘two-tailed’, ‘non-paired’ test.
The T.TEST() function of your spreadsheet software returns the p-value for the test. The
p-value is the probability of obtaining a difference at least as large as yours if H0 were true.
You can watch a tutorial video on the t-test in Excel here:
https://youtu.be/DPNUpldVC4M
As the p-value decreases, the likelihood of the difference being due to random chance also
decreases. Eventually, the p-value is so small, that it is no longer reasonable to assume that
the null hypothesis is true, and therefore the null hypothesis can be rejected. The most
commonly used threshold level (also called the alpha value) for rejecting the null hypothesis
is 0.05. That means that if the p-value sinks below 0.05 (5%), then the null hypothesis can be
rejected and the alternative hypothesis accepted.
In the case of these data the p-value is 0.00008. This is well below the threshold level of
0.05, and therefore the null hypothesis can be rejected. The observed difference between
the two soils is very unlikely to be due to random chance; it is a statistically significant
difference.
ANOVA
The ANOVA test should be used if you have more than two groups to compare. Though
you could theoretically perform many t-tests between each of the possible combinations,
this is inefficient, and mathematically risky. Every time you perform a t-test, there is a
small probability that your difference was in fact due to a random fluke, and not a real
difference. If you perform many t-tests, the likelihood of making such an error increases.
The ANOVA test assumes the following hypotheses:
H0: The groups are all the same
HA: At least one of the groups appears to be different than the rest. The variability
between groups is not likely to be due to chance.
The ANOVA test produces a p-value that can be interpreted in the same way as in the
t-test. If the p-value is below the threshold of 0.05, then the null hypothesis can be rejected
and the alternative hypothesis concluded. It is not clear from the ANOVA test what groups
are different from each other, but instead that the variability between the groups is not due
to chance.
You can learn how to perform the ANOVA test in Excel or LibreOffice here:
Excel: https://youtu.be/qQSQr_JldyY
LibreOffice: https://youtu.be/TxTKq4W8qX8
THE MANN-WHITNEY U TEST
The Mann-Whitney U test is mathematically very different than the t-test, but achieves a
similar goal of finding out whether an observed difference is statistically significant.
Instead of using the actual values of the measurements for the comparison, it simply
compares the rank order of the values. This is somewhat analogous to the difference
between a mean and median average with the t-test being similar to the mean and the
Mann-Whitney U similar to the median. This type of hypothesis test that does not rely on
the actual value of the measurements is called a non-parametric test.
Page 70
The null and alternative hypotheses are the same as for the t-test:
H0: There is no significant difference between the two groups
HA: The observed difference between the groups is statistically significant, and not
likely due to chance.
Although you can painstakingly calculate the Mann-Whitney U statistic by hand, with
some help from your spreadsheet software, this is not a requirement of the IB. Instead
simply enter your two sets of data in this online calculator:
https://www.socscistatistics.com/tests/mannwhitney/default2.aspx
The calculator gives you the p-value for the test, which you compare to the alpha threshold
of 0.05 as in the t-test. If the p-value is below 0.05, you can reject the null hypothesis and
conclude that there is a statistically significant difference.
TESTING FOR CORRELATION
If your investigation intends to look for a relationship between two variables, it might be a
good idea to test whether the correlation your data suggest is statistically significant or
likely to be due to chance. A correlation test does just that. These tests work in a similar
way to tests for differences. The two most relevant tests are the Pearson product-moment
correlation test, also called the Pearson correlation test, and the Spearman rank
correlation test.
THE PEARSON CORRELATION TEST
The Pearson correlation test, similar to the t-test, requires the data to meet some
prerequisites. In order to run this test, the following conditions must be true:
• The data are numerical for both variables.
• The data are paired, that is, there are two measurements or values for each data point,
the dependent and independent variables.
• The data follow a linear trend.
• The data can be assumed to be normally distributed for both variables.
• There are no obvious outliers in the data set.
If these conditions are met, you can continue with the test.
The null and alternative hypotheses are as follows:
H0: There is no correlation in the data. The observed trend is due to random chance.
HA: The observed trend is a statistically significant correlation.
In your spreadsheet software, enter the formula to calculate r, the Pearson correlation
coefficient:
=PEARSON(dataset of independent variable, dataset of dependent variable)
This formula calculates the r value that then needs to be compared to a critical value table
(see the critical value tables).
The critical value table shows what values of r are significant. The strength of the test
depends on the number of data points included (n), so the critical value also changes with
n. Keep in mind that one data point has two measurements. If you were comparing, for
example, height and weight of plants, and measured height and weight of 12 plants, then n
would be 12, not 24. You simply need to compare the r value that you calculated to the
critical value corresponding to the number of data points you used. If the absolute value of
r is greater than the critical value, then the correlation is statistically significant and not
likely to be due to chance.
For example, if your r value was -0.65 and you had 10 data points, you would compare that
r value to the critical value 0.521 from the critical value table and conclude that the
absolute value of r is greater than the critical value. Therefore the correlation is statistically
significant.
SPEARMAN RANK CORRELATION TEST
The Spearman rank correlation is denoted by the symbol ρ (rho) or rs. Analogous to how
the Mann-Whitney U test compares rank instead of the actual values of the data, the
Spearman rank test determines a correlation in the data by looking at the rank order of the
data instead of its actual values. Use this test instead of the Pearson correlation test if any
of the following are true:
• The data are not numerical but are ordinal.
• The data are not normally distributed.
• The data do not appear to have a linear trend, but do trend either positively or
negatively.
• There are apparent outliers in the data.
Page 71
The null and alternative hypotheses are the same as for the Pearson correlation test:
H0: There is no correlation in the data. The observed trend is due to random chance.
HA: The observed trend is a statistically significant correlation.
To calculate the test statistic, simply enter your data in the calculator at this site, and
interpret the p-value as in the other tests:
https://www.socscistatistics.com/tests/spearman/default2.aspx
OTHER STATISTICAL TESTS
Though making comparisons between groups and determining correlations between
variables are the most common statistical tests, there are many other ways to test data for
significance. The chi-squared test for goodness of fit and the nearest neighbour analysis are
two that you might need for biology and geography respectively.
THE CHI-SQUARED TEST
This test determines whether data fit a pattern or model. The data for the test needs to be
categorical. One of the simplest versions of this test is to test for an association between
species, that is whether the location of some species of immobile organism associates with
the location of another species. A chi-squared test can also be used in genetics to
determine if the frequency of genotypes matches the expected ratios. For more information
about this type of test and how to calculate and interpret the chi-squared value, refer to
your Oxford biology textbook on pages 215 (association between species) and 453
(genotype ratios).
Here is a demonstration of how you can use Excel to calculate chi-squared:
https://youtu.be/o0VhMWeotFg
NEAREST NEIGHBOUR ANALYSIS
In geography, the nearest neighbour analysis can be used to determine if the spacing
between points is random, clustered, or ordered. First, the data is collected by measuring
the distance between each location (e.g. tree) and its nearest neighbour. The nearest
neighbour index (NNI or Rn) is then calculated according to the following formula:
$$\mathrm{NNI} = 2\bar{D}\sqrt{\frac{n}{A}}$$
Where D̄ is the average nearest neighbour distance,
n is the number of observations, and
A is the total area studied.
The NNI value can indicate that the points are clustered, random, or ordered, depending on
its value:
NNI = 0 : The points are completely clustered
NNI = 1.0 : The points have a completely random distribution
NNI = 2.15 : The points are distributed uniformly.
[Scale: 0 (clustered), 1.0 (random), 2.15 (uniform)]
Given the number of data points, you can compare the NNI value to the critical value table.
If NNI is below the number for clustered points, then there is statistically significant
clustering. If it is above the value for uniformity, then the points are statistically
significantly uniform. If it lies between the two values, the points are randomly distributed.
For example, the following data was collected by measuring
the distances between trees in a 36 m² park:
Tree    Nearest    Distance (m)
1       2          1.1
2       1          1.1
3       2          1.3
4       7          0.4
5       3          1.2
6       7          1.0
7       4          0.4
8       9          2.0
9       8          2.0
[Map: positions of trees 1 to 9 in the park]
D̄ = 1.2
n = 9
A = 36 m²
Page 72
Since n = 9, A = 36 m², and D̄ = 1.17 m, the NNI value is therefore calculated as:
$$\mathrm{NNI} = 2\bar{D}\sqrt{\frac{n}{A}} = 2 \cdot 1.2 \cdot \sqrt{\frac{9}{36}} = 1.2$$
For n = 9, the critical value table gives 0.713 as the limit below which the points would be
considered clustered, and 1.287 as the upper limit, above which the data would be
considered ordered. You can therefore conclude that the trees are randomly dispersed.
CRITICAL VALUE TABLES
Pearson r
n       Critical value
2       0.988
3       0.900
4       0.805
5       0.729
6       0.669
7       0.622
8       0.582
9       0.549
10      0.521
11      0.497
12      0.476
13      0.458
14      0.441
15      0.426
16      0.412
17      0.400
18      0.389
19      0.378
20      0.369
21      0.360
26      0.323
31      0.296
36      0.275
41      0.257
46      0.243
51      0.231
61      0.211
71      0.195
81      0.183
91      0.173
101     0.164
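The comparison of r with the Pearson critical values can be sketched in Python as follows. The paired measurements are invented for illustration, `pearson_r` and `is_significant` are hypothetical helper names, and only a handful of rows from the critical value table are copied into the lookup.

```python
import math

# A few rows copied from the Pearson r critical value table (n -> critical value).
CRITICAL_R = {8: 0.582, 9: 0.549, 10: 0.521, 11: 0.497, 12: 0.476}


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n):
    """True if |r| exceeds the tabulated critical value for n data points."""
    return abs(r) > CRITICAL_R[n]


# Hypothetical data: 10 paired measurements (one pair per data point, so n = 10).
independent = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
dependent = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

r = pearson_r(independent, dependent)
print(f"r = {r:.3f}, significant: {is_significant(r, len(independent))}")
```

The same `is_significant` check reproduces the handbook's worked example: an r of -0.65 with 10 data points exceeds the critical value of 0.521 in absolute value.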
Nearest neighbour index
Critical values
n       clustered    uniform
2       0.392        1.608
3       0.504        1.497
4       0.570        1.430
5       0.616        1.385
6       0.649        1.351
7       0.675        1.325
8       0.696        1.304
9       0.713        1.287
10      0.728        1.272
11      0.741        1.259
12      0.752        1.248
13      0.762        1.239
14      0.770        1.230
15      0.778        1.222
16      0.785        1.215
17      0.792        1.209
18      0.797        1.203
19      0.803        1.197
20      0.808        1.192
21      0.812        1.188
22      0.817        1.183
23      0.821        1.179
24      0.825        1.176
25      0.828        1.172
26      0.831        1.169
27      0.835        1.166
28      0.838        1.163
29      0.840        1.160
30      0.843        1.157
31      0.846        1.155
32      0.848        1.152
33      0.850        1.150
34      0.853        1.148
35      0.855        1.145
36      0.857        1.143
37      0.859        1.141
38      0.861        1.140
39      0.862        1.138
40      0.864        1.136
41      0.866        1.134
42      0.867        1.133
43      0.869        1.131
44      0.870        1.130
45      0.872        1.128
50      0.878        1.122
60      0.889        1.111
70      0.897        1.103
80      0.904        1.096
90      0.909        1.091
100     0.914        1.086
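To show how the nearest neighbour table is applied, here is a minimal Python sketch using the nine tree distances from the park example. The function names are hypothetical, and the two limits are the n = 9 row of the table above.

```python
import math

# Nearest neighbour distances (m) for the 9 trees in the 36 m² park example.
distances_m = [1.1, 1.1, 1.3, 0.4, 1.2, 1.0, 0.4, 2.0, 2.0]
area_m2 = 36.0


def nearest_neighbour_index(distances, area):
    """NNI = 2 * mean nearest-neighbour distance * sqrt(n / area)."""
    n = len(distances)
    mean_distance = sum(distances) / n
    return 2 * mean_distance * math.sqrt(n / area)


def classify(nni, clustered_limit, uniform_limit):
    """Compare NNI with the two critical values for the sample size."""
    if nni < clustered_limit:
        return "clustered"
    if nni > uniform_limit:
        return "uniform"
    return "random"


nni = nearest_neighbour_index(distances_m, area_m2)
result = classify(nni, clustered_limit=0.713, uniform_limit=1.287)  # n = 9 row
print(f"NNI = {nni:.2f}, pattern: {result}")
```

Working with the unrounded mean distance gives an NNI of about 1.17, which rounds to the 1.2 quoted in the worked example and leads to the same conclusion.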
Page 73
HOW TO USE SPREADSHEET SOFTWARE
You will need to spend some time learning how to use your brand of software on your
platform, as they all differ somewhat. Excel© (subscription based) and LibreOffice©
(freeware) are both good options, but you could also use Numbers© on MacOS, or Google©
Sheets, though the latter has some significant limitations. Many of these calculations can
also be performed on a TI-Nspire© handheld. Searching the web, or using your software’s
help function will usually yield quick answers to tricky problems. Here are some tips and
resources that might help you on your way:
Tip: Be sure you know whether your software expects a decimal point ( . ) or a comma ( , )
as a decimal separator. If you use the wrong one, the computer does not recognise your data
as numbers, but instead treats it as text which causes all calculations to fail. Use the
‘search and replace’ function of your software to change all of them at once.
Tip: Use this site to find the appropriate function in your software’s language:
http://www.excelfunctions.eu
The Moodle site 7AB Tabellenkalkulation has guides for performing simple calculations
and making diagrams in Excel© and LibreOffice© here:
https://moodle.tsn.at/course/view.php?id=36089
At Mr. Schauer’s youtube channel you can find a handful of videos on data analysis:
bit.ly/mrschauersyoutube
Here are some instructions on how to make a histogram in Excel:
https://support.office.com/en-us/article/create-a-histogram-85680173-064b-4024-b39d-80f17ff2f4e8
For more information on calculating quartiles and the inter-quartile range using Excel, visit
this site:
https://www.statisticshowto.com/probability-and-statistics/interquartile-range/#IQRExcel
MORE HELP AND FURTHER READING
For help reviewing how to use significant figures appropriately, watch these videos:
An introduction to significant figures: https://youtu.be/eCJ76hz7jPM
Rules to determine significant figures: https://youtu.be/eMl2z3ezlrQ
For lots of in-depth information on the geography Internal Assessment, visit these pages:
https://www.thinkib.net/geography/page/22606/ia-student-guide
https://sites.google.com/site/geographyfais/fieldwork
For more help with biological statistics for the IA, visit this site:
https://www.biologyforlife.com/statistics.html
For a very useful handbook of basic statistics, look for a copy of this book:
Methods of Statistical Analysis of Fieldwork Data. St. John, P. and Richardson, D.A.
Geographical Association 1996.
Page 74
Using statistics in IB biology and ESS 2014
Table of Contents
Types of questions and types of data
Why is this important?
Types of questions
Types of data
The normal distribution
Presenting data
Which statistical test?
How to process your data in Excel
Descriptive statistics (Mean, Standard deviation, 95% confidence interval)
Mean
Standard deviation
95 % confidence interval
Null and alternative hypotheses
How significant is the difference in my data?
Testing for differences between 2 sets of data
T tests
One or two tailed?
Paired or unpaired?
Homoscedastic or heteroscedastic?
Testing for differences between more than 2 sets of data
Enabling the Data Analysis Package
Carrying out an ANOVA analysis
Looking for CORRELATIONS between variables.
What does it tell us if data are correlated
Which test?
Pearson test for correlation.
Spearman’s Rank test for correlation
Testing to see if your observed data matches the expected data
What type of data?
Calculating Chi Squared by hand
Calculating Chi-squared using Excel
Types of questions and types of data
Why is this important?
It is important, in the PLANNING stages of an investigation, to think about how you will process
the data that you collect. A common mistake is to collect data and then try to decide what you
will do with it. This often results in people realising that they do not have enough data, or that
the data is the wrong type for the statistical test they would like to perform. Worse still, it
sometimes means that people try to do all of the tests they know and then pick the one that was
significant, not for a good scientific reason, but simply because it matches their preconceptions.
This is VERY bad science!
Types of questions
In biology you are usually (but not always) interested in one of 3 types of question;
Is there a significant difference between my sets of data (e.g. Are girls taller than boys? Who are
more intelligent: brown, blue or green eyed people?)?
Is there a relationship between my 2 sets of measurements (e.g. Do taller people have bigger
feet? Does increasing light intensity cause an increase in photosynthesis?)?
Do the observed frequencies I have counted match the expected theory (e.g. The probability of
rolling a 6 on a dice is 1/6. My sister has rolled 13 sixes in her last 36 rolls. Is the dice ‘dodgy’?)?
The type of question you are asking has an effect on the type of data you need to collect.
This in turn affects the statistics you need to use and the type of graph you will draw.
Types of data
Data can be continuous (i.e. it is measured on a scale where any value on that scale is possible),
discrete (i.e. there are particular values that are possible and other values that are not possible)
or ranks (i.e. you have data in an order 1st, 2nd, 10th or you have rating scales). You need to
display these types of data differently and do different kinds of descriptive statistics. You also
need to use different statistical tests. It is possible to convert between types (e.g. grouping
continuous height data into categories ‘short’ and ‘tall’ or ranking people from tallest to shortest)
if you need to process the data in a particular way.
The independent variable is the one you have deliberately changed to see what effect this has
on the dependent variable (the ‘results’ you have measured to see if there has been a change).
Control variables are other variables that should be controlled to ensure a fair test. If it is
impossible to control them (e.g. in field work) then they should be measured so that their effect
on the data can be evaluated.
The normal distribution
The normal distribution is frequently seen in biology. There is natural variation in many
characteristics as a result of genes interacting with the environment. A normally distributed
characteristic is one that is symmetrically distributed around the mean with most values being
close to the mean and very few values being at the extremes. If a population is normally
distributed then 50% of values will be below the mean and 50% above the mean. Also 68 % of
values will be within 1 standard deviation of the mean and 95 % will be within 2 standard
deviations of the mean. Only continuous data can be normally distributed. Discrete and rank
data cannot be normally distributed unless it is ‘transformed’ which is well beyond what is
expected in IB biology!
Presenting data
If the independent variable is discrete (categories) and the dependent variable is continuous then
you should plot a bar graph with the mean of the data and Y error bars showing the 95%
confidence interval or the standard deviation if you have sufficient repeats. If you do not have
sufficient repeats, use the range or the estimate involving √N, where N is the number of values.
If the independent variable is discrete and the dependent variable is a rank or count data you will
need to decide if a clear table, a bar graph or pie chart is the most appropriate way to display
the data.
If both variables are continuous then it should be plotted as a scatter graph with the independent
variable on the X axis. X error bars should represent the uncertainty in the measurement and the
Y error bars should represent the uncertainty of the mean of the measured data (either 95%
confidence interval or standard deviation if you have sufficient repeats, or the range or the
estimate involving √N, where N is the number of values, if you do not have sufficient repeats).
If you have no repeats or if the uncertainty in the measurement is larger than the variation within
the data then you should use the error in the measurement (e.g. if you were measuring
temperature to the nearest degree and your repeat measurements gave you a standard deviation
of 0.2 you would use ±1 rather than ±0.2 as it is the larger error).
Page 75
Which statistical test?
The statistical test you choose depends on the type of question and whether your data is from a
normal distribution (parametric) or not (non-parametric). If you are not sure if your data fits a
normal distribution it is better to assume it DOES NOT and use the non-normal distribution
(non-parametric) tests. Most continuous variables in living things do follow a normal
distribution due to the interaction of genes and the environment.
The chart on the next page tells you how to decide which type of test you need to use. Although
there are a large number of other statistical tests, it is probably best to avoid collecting data or
posing a question that cannot be answered using one of the tests shown here unless you have
a real statistical package available to you! In the sections following the chart are instructions on
how to perform each test in Excel.
Page 76
Use the flow chart above to help you decide which is the most appropriate test for the type of
question you want to answer. Make sure the data you collect allows you to perform the test
you would like to do!
How to process your data in Excel
Descriptive statistics (Mean, Standard deviation, 95% confidence interval)
The Mean, Standard deviation and Confidence intervals are easily calculated in Excel using
the following functions;
Mean
=AVERAGE(range)
where range is the range of cells containing the data
Standard deviation
=STDEV(range)
where range is the range of cells containing the data
95 % confidence interval
=CONFIDENCE.T(0.05,st_dev,size)
where 0.05 is the significance level that corresponds to 95% confidence (you can use 0.1 if you
want the 90% CI etc.), st_dev is the standard deviation (as calculated above) and size is the
number of data points in the sample. The function returns the distance to add to and subtract
from the mean to give the interval.
The standard deviation tells you how variable the data you sampled was. Small standard
deviations indicate that all of the data was close together around the mean. A large standard
deviation suggests that the data varied a lot and were spread out around the mean.
The 95 % confidence interval gives you a range around the mean that has a 95% chance of
containing the true mean of the data.
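For readers who want to check the spreadsheet results another way, the same three descriptive statistics can be sketched in Python. The heights are the Soil a dandelion data from the IB Statistics Handbook earlier in this pack; the critical t value of 2.131 (two-tailed, alpha 0.05, 15 degrees of freedom) is taken from a standard t table and is an input you would look up for your own sample size, and `confidence_half_width` is a hypothetical helper name.

```python
import math
import statistics

# Soil a dandelion heights (cm), 16 plants.
heights_cm = [8.0, 6.3, 9.1, 13.2, 12.0, 6.3, 10.0, 11.0,
              12.1, 9.8, 8.5, 12.2, 9.7, 10.1, 13.2, 10.3]

T_CRITICAL_DF15 = 2.131  # from a t table: two-tailed, alpha = 0.05, df = n - 1 = 15


def confidence_half_width(values, t_critical):
    """Half-width of the confidence interval: t * s / sqrt(n)."""
    sd = statistics.stdev(values)  # sample standard deviation, as STDEV gives
    return t_critical * sd / math.sqrt(len(values))


mean = statistics.mean(heights_cm)
sd = statistics.stdev(heights_cm)
half_width = confidence_half_width(heights_cm, T_CRITICAL_DF15)

print(f"mean = {mean:.1f} cm, standard deviation = {sd:.1f} cm")
print(f"95% CI = {mean - half_width:.1f} to {mean + half_width:.1f} cm")
```

The half-width plays the same role as the value returned by CONFIDENCE.T: it is added to and subtracted from the mean.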
You should have at least 5 measurements before you try to calculate standard deviation or 95%
confidence interval. With fewer than this the value is not really reliable and you should use a
different estimation of the spread of the data such as the range or the estimate involving √N.
Null and alternative hypotheses
It is important to remember that these hypotheses are for the STATISTICAL TEST and not for
your investigation. You do not need to state a hypothesis as part of your method (although you
should have some idea of what you are expecting to find and why!). When you start to analyse
your data you should say which test you are going to use and what the Null hypothesis (H0) and
alternative hypothesis (HA) are. In general your null hypothesis will be that there is NO significant
difference/correlation in your results and the alternative hypothesis is that there IS a significant
difference or correlation. The statistical tests calculate a probability (the p-value) of getting
results at least as extreme as yours if the null hypothesis were true. A high probability value
indicates that any differences or correlations could easily have appeared by chance,
a low value that the differences or correlations in the data are significant.
How significant is the difference in my data?
Statistical tests are used to give you a probability that helps you judge whether your sets of
data differ, whether your data is correlated or whether your data fits the expected pattern.
They cannot tell you WHY the difference or correlation does or does not exist!
There is always a chance that the statistics will give you a false positive (i.e. saying there is a
significant result when there is not) or a false negative (saying there is no significance when there
really is). Because of this scientists usually use a 5% cut off when interpreting the results of tests
of difference (there is no good scientific reason for this and it is still a controversial topic) although
some scientists will say their data is significant at the 10% level and others will only accept it as
significant if it is less than 1%.
Generally biologists will use higher significance levels (10% or 5%) than chemists or physicists
as it is much harder to control other variables in biological systems.
As part of your evaluations you will have to decide how significant you think the result of your
statistical test is!
Page 77
Testing for differences between 2 sets of data
T tests
In Excel the T.TEST function gives you the probability of seeing a difference at least as large
as yours if the 2 data sets (that are separate samples of different individuals) really came from
the SAME population. A T test to look for a difference between 2 sets
of data is carried out using the following formula;
=T.TEST(array 1, array 2, tails, type)
Array 1 and 2 are the data sets. See below for discussion of number of tails and type of test.
Data must come from a NORMALLY DISTRIBUTED population for you to perform a T test and
there should be at least 10 measurements in each set (but there do not need to be the same
number in each group!).
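To make the calculation behind an unpaired, equal-variance t test more concrete, the sketch below computes the t statistic by hand in Python and compares it with a tabulated critical value instead of converting it to a p-value. The two samples are the dandelion heights from the IB Statistics Handbook earlier in this pack; the critical value 2.042 (two-tailed, alpha 0.05, 30 degrees of freedom) comes from a standard t table, and `pooled_t_statistic` is a hypothetical helper name.

```python
import math
import statistics

soil_a = [8.0, 6.3, 9.1, 13.2, 12.0, 6.3, 10.0, 11.0,
          12.1, 9.8, 8.5, 12.2, 9.7, 10.1, 13.2, 10.3]
soil_b = [15.5, 14.7, 14.5, 12.2, 10.1, 15.0, 12.1, 13.2,
          16.0, 14.2, 13.1, 9.9, 17.8, 10.3, 16.4, 19.0]

T_CRITICAL_DF30 = 2.042  # from a t table: two-tailed, alpha = 0.05, df = 30


def pooled_t_statistic(a, b):
    """Unpaired t statistic assuming equal variances; returns (t, df)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df


t, df = pooled_t_statistic(soil_a, soil_b)
significant = abs(t) > T_CRITICAL_DF30
print(f"t = {t:.2f} with {df} degrees of freedom, significant: {significant}")
```

Because the absolute value of t is well beyond the critical value, the difference is significant at the 5% level, which agrees with the very small p-value a spreadsheet reports for these data.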
One or two tailed?
The number of tails depends on the data. If you have a good scientific reason for believing that
one group MUST be larger than the other then you do a 1 tailed T-test. If the difference in means
could be in either direction, that is you do not have a reason for thinking that one group must be
bigger then you should do a 2 tailed test. If in doubt use a 2 tailed test as you are less likely to
get a false positive.
Examples;
You want to know at what age children stop growing. You measure twenty 16 year olds and
twenty 17 year olds to see if there is a significant difference in height between the two groups.
You know that IF there is a difference then the 17 year olds will be taller as children grow with
age, or stay the same, but never shrink. Because of this you do a 1-tailed T test.
You want to know if boys or girls have a faster reaction time. You test twenty boys and twenty
girls. There is no scientific reason for thinking that boys MUST be faster than girls so in this case
you do a 2-tailed T test.
Paired or unpaired?
A paired T test is used when the 2 sets of data come from pairs of measurements of the SAME
people. This is often (but not always) a measurement before and after a treatment or a period
of time. e.g. if you wanted to know if exercise makes pulse rate increase you would measure the
pulse rate of ten people before and after exercise. The data is in PAIRS (person A before and
after exercise) and it matters who the data was collected from. This makes a PAIRED T test the
correct choice.
An unpaired T test is used when the 2 measurements are from different sets of people. In a test
to see if boys are taller than girls, the people measured for the girl category are a completely
different set of people to those in the boy category. In this case an unpaired T test is used.
Homoscedastic or heteroscedastic?
In simplified terms, this is about the variability of data. Generally if your measurements all have
a similar reliability (e.g. measuring the height of people using a stadiometer) then the data will
be homoscedastic. If some readings are less reliable than others (e.g. estimating bird
populations – the counts of small populations will be more accurate than the estimates for large
populations) then data may be heteroscedastic. If in doubt use the heteroscedastic test as it is
more robust and less likely to give a false positive.
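The choices above (paired or unpaired, equal or unequal variability) change which formula sits behind the test. As a rough sketch only, the Python below computes the test statistic t and the degrees of freedom for the unpaired heteroscedastic case and for the paired case. The function names and the data are hypothetical, the samples are shorter than the recommended 10 values to keep the example small, and turning t into a probability still needs a table of critical values (or Excel's T.TEST).

```python
import math
import statistics


def unpaired_unequal_variance_t(sample_a, sample_b):
    """Return (t, df) for an unpaired, heteroscedastic t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance uses the n - 1 (sample) denominator
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Degrees of freedom are estimated from the two variances
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


def paired_t(before, after):
    """Return (t, df) for a paired t test on before/after measurements."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    t = statistics.mean(differences) / (statistics.stdev(differences) / math.sqrt(n))
    return t, n - 1


# Hypothetical data, for illustration only
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_unpaired, df_unpaired = unpaired_unequal_variance_t(group_a, group_b)

pulse_before = [60, 62, 64]
pulse_after = [70, 75, 80]
t_paired, df_paired = paired_t(pulse_before, pulse_after)

print(t_unpaired, df_unpaired)
print(t_paired, df_paired)
```

Note that the paired version works on the differences within each pair, which is why it matters who each measurement came from.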
Testing for differences between more than 2 sets of data
It is bad science to do multiple T tests if you have more than 2 sets of data as by doing so you
are increasing the chance that you will get a false positive when there really is no significant
difference. You can solve this by doing an ANOVA (ANalysis Of VAriance). This is similar to a T
test. To do this in Excel is simple, but you need to have the Data Analysis package installed and
activated before it can be used.
Page 78
Enabling the Data Analysis Package
To enable this click on ‘File’ in any Excel document, choose ‘Options’ from the menu at the left
hand side; a new window will pop open. Click on ‘Add-ins’, in the drop down menu at the bottom
of the window select ‘Excel Add-ins’ and click ‘Go’ near the bottom right of the page. Check the
‘Analysis ToolPak’ option and click OK. This should activate the package. You should be able to
see it as an option when you click on the Data tab in Excel.
Carrying out an ANOVA analysis
For example you have conducted IQ tests on blue, brown and green eyed people. Rather than
do 3 T-tests (blue v brown, blue v green, green v brown) you need to do an ANOVA. Your data
may be grouped in rows and look something like this…
To perform an ANOVA click on the ‘Data’ tab and then
select ‘Data Analysis’ from the right hand side of the menu.
A pop-up will appear. Choose ‘Single factor ANOVA’ (you
are only changing ONE factor – eye colour). In the box that
appears select ALL the data including headings, click that
the data are grouped by column and tick the ‘labels in the
first row’ box. Tell it to output the data generated into a
new worksheet and click OK.
The output looks complicated but is actually easy to
understand… It should look something like this:
In this case the P-value is much greater than 0.05 so we can conclude that there is no
significant difference between the IQ of people with different coloured eyes.
Below is the output of the data for marks in the General Knowledge quiz of randomly selected
P6, S3 and UB students.
From this you can see the P-value is 8.2 x 10^-19 which is MASSIVELY lower than 0.05. This tells
us that the year group DOES have a significant effect on the score in the GK quiz. Looking at
the data you can clearly see who has the highest score and who has the lowest score.
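To show what the single factor ANOVA is doing behind the Excel output, here is a minimal Python sketch that computes the F value by comparing the variation between groups with the variation within groups. The function name and the three small groups are hypothetical. The P-value that Excel reports is derived from this F value and the two degrees of freedom, which this sketch does not attempt.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the overall mean
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of individual values around their own group mean
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical scores for three groups, for illustration only
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_between, df_within = one_way_anova_f(groups)
print(f_value, df_between, df_within)
```

A large F value means the groups differ from each other by more than you would expect from the scatter inside each group.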
Looking for CORRELATIONS between variables.
Page 79
What does it tell us if data are correlated?
Correlations tell us if 2 things vary together. This could be because one is causing the other or
it might be that the two things are being affected by a third unknown variable or a complex set
of changes.
e.g. lung cancer is correlated with smoking because chemicals in tobacco smoke CAUSE tumours
to form; if you reduce the amount you smoke you can reduce your risk of cancer.
Height is correlated with shoe size because the genetic and environmental factors that affect your
height also affect all of your other bones. If I fed you more protein and calcium you would get
taller AND your feet would grow larger. However if I stretched you, your feet would not magically
stretch too – there is no cause and effect.
Climate change has increased as piracy has fallen; this is not because pirates reduce global
warming but rather because both are affected by the complex changes in society over the last
200 years! If we introduced more pirates there would be no reduction in climate change!
Which test?
Testing to see if data is correlated is easy to do in Excel. There are two types of test: Pearson
Correlation and Spearman’s Rank. To decide which one to do you need to think about your data.
If it is CONTINUOUS data that is normally distributed in the population then you can use a Pearson
test. If not then you need to use a Spearman’s rank.
Pearson test for correlation (r).
To carry out a Pearson test for correlation you use the formula =PEARSON(array1, array2) where
array1 is your first set of data and array2 is your second set of data. The value returned is a
number between -1 (for very strong negative correlation) and 1 (for very strong positive
correlation). A value of 0 means no correlation.
You can test the significance of your value by comparing your value with the critical value for the
sample size you have tested. This will allow you to decide if the correlation you have measured
is real or has occurred by chance.
To do this decide on how many degrees of freedom your data has (in the case of correlations
df = N-2 where N is the number of data points you have). You look this up in a table of critical
values for the Pearson correlation coefficient and see if your value is larger than the value given
for the 0.05 significance level. If it is you can reject the null hypothesis that the correlation has
occurred by chance and accept the alternative hypothesis that there is a real correlation
between the 2 measurements.
You can find tables for the Pearson correlation coefficient here:
http://www.gifted.uconn.edu/siegle/research/correlation/corrchrt.htm
e.g. in an experiment to see if arm span was related to height 7 people were measured and a
correlation coefficient of 0.973 was calculated. The number of degrees of freedom is N-2 = 5.
The critical value at p=0.05 for 5 df is 0.754. Your value is higher than this so the correlation
IS significant. You report your findings in the following way:
A Pearson test for correlation was carried out and found the following: r(5) = 0.973, p<0.05
i.e. r(number of degrees of freedom) = calculated coefficient, probability value used
Spearman’s Rank test for correlation
This test uses the same formula as the Pearson correlation BUT the test is carried out on RANKED
data, not the raw data. You can rank the data in 2 ways… If you have small amounts of data
you can rank it by hand – the highest value in each set of data receives the value 1, the next
highest 2 etc. If 2 pieces of data share the same value then their ranks are averaged.
If you have a lot of data it may be quicker to do it using the following formula in Excel
=RANK.AVG(number,ref,order)
Where number = the cell reference of the value you are putting into order (e.g. A2),
ref = the cell references of the range of data you are comparing it to (e.g. $A$2:$A$11 – the
dollar signs give you the ‘absolute’ cell reference so that when you copy and drag the formula
down it still looks at the same reference – just press F4 when your data is selected to do this!)
and order = 0 (or left out) to rank in descending order, so that the highest value gets rank 1,
or 1 to rank in ascending order, so that the lowest value gets rank 1.
e.g. when testing to see if there is a relationship between height and nose size the formula to rank
the Height would be =RANK.AVG(A2,$A$2:$A$11, 0) and the data may look like this…
As for the Pearson Correlation a value close to 1 or -1 signifies very strong correlation and close
to 0 signifies no correlation. You can use the tables for the Pearson coefficient to determine the
significance of the values in the same way as for a Pearson correlation.
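The link between the two correlation tests can be sketched in a few lines of Python: the Spearman coefficient is simply the Pearson formula applied to the ranks. The function names and the data are hypothetical and chosen so that the relationship is a perfect curve rather than a straight line, which is where the two coefficients give different answers.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank 1 goes to the highest value; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank coefficient: Pearson's formula applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y always rises with x, but along a curve
x_values = [1, 2, 3, 4, 5]
y_values = [1, 4, 9, 16, 25]

r = pearson_r(x_values, y_values)
rs = spearman_rs(x_values, y_values)
print(round(r, 3), round(rs, 3))
```

Because the ranks of the two lists match exactly, the Spearman value is 1 even though the Pearson value falls slightly short of 1.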
Page 80
Testing to see if your observed data matches the expected data
What type of data?
This test is for testing if the observed FREQUENCY of something matches the null hypothesis
‘expected’ values. It can only be performed when you have counted something.
Your Null hypothesis is USUALLY that there will be no difference in frequency between the groups
(e.g. if I asked 100 people to choose a red, green, blue or white t-shirt then my null hypothesis
would be that 25 would choose red, 25 blue, 25 green and 25 white).
The only exceptions to this are genetic crosses, where you could be expecting a 3:1 ratio or a
9:3:3:1 ratio, or work with populations that have a known make-up (e.g. if I know that there are
100 girls and 50 boys in a school my null hypothesis for a class of 15 students would be that
there would be 10 girls and 5 boys).
To perform a Chi-Squared test you must have COUNTED something and the expected values
must be at least 5. The EXPECTED values do not have to be whole numbers!
If you do Higher biology they expect you to be able to calculate Chi-squared by hand. If not you
can do it using Excel!
Calculating Chi Squared by hand
For example you want to decide if one soft drink is more popular among P6 students than
others, so you give away soft drinks to 100 P6 pupils and give them a choice of Coca Cola, Inca
Cola, Fanta and Sprite. You record the OBSERVED numbers of pupils choosing each drink and
get the following data:
            Inca Cola    Coca Cola    Fanta    Sprite
Observed    52           20           18       10
Your Null hypothesis is that no drink is more popular than the others so your EXPECTED values
are 25 for each drink. You first need to calculate the Chi squared value and then compare it with
the critical value for the number of degrees of freedom in the χ2 table to see if the result is
statistically significant.
The formula for Chi squared is as follows:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Where O = the observed value and E = expected value. It is useful to make a table to show the
calculation process like this.
Drink        Observed    Expected    O-E    (O-E)^2    (O-E)^2/E
Inca Cola    52          25          27     729        29.16
Coca Cola    20          25          -5     25         1
Fanta        18          25          -7     49         1.96
Sprite       10          25          -15    225        9
                                            Total      41.12
The number of ‘degrees of freedom’ is the number of categories minus 1, so χ2 = 41.12 (3 d.f.).
You look up the critical value for a probability of 0.05 at 3 d.f. in the tables.
The CRITICAL value for χ2 is 7.82. This is MUCH lower than our calculated value therefore we
REJECT the null hypothesis that there is no preference and ACCEPT the alternative
hypothesis that there IS a significant difference in the number of people choosing different fizzy
drinks. Looking at the data it is obvious that this difference is a preference for Inca Cola.
If our value had been smaller than 7.82 we would have had to ACCEPT the NULL
hypothesis and conclude that all the drinks tested were equally popular.
Calculating Chi‐squared using Excel
You need a results table that contains the Observed data and Expected data. Use the formula
=CHISQ.TEST(actual_range, expected_range) where ‘actual range’ is the observed data and
‘expected range’ is your expected data. It will work whether your data is arranged in rows or
columns. The number this formula returns is the PROBABILITY of getting differences at least
this large between the observed and expected values if the null hypothesis were true (i.e. if
the observed and expected values were really the SAME). As normal if this is less than 0.05
then the test shows a SIGNIFICANT difference and the observed and expected values are
DIFFERENT. If the value is greater than 0.05 then you have to ACCEPT the hypothesis that the
observed and expected values are the SAME.
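The hand calculation for the fizzy drinks example can be checked with a short Python sketch. The observed counts, the expected value of 25 per drink and the critical value of 7.82 all come from the example above; the variable names are just illustrative.

```python
# Observed counts from the fizzy drinks example
observed = {"Inca Cola": 52, "Coca Cola": 20, "Fanta": 18, "Sprite": 10}

# Null hypothesis: every drink is equally popular
total = sum(observed.values())
expected = total / len(observed)

# Chi-squared = sum of (O - E)^2 / E over all categories
contributions = {
    drink: (count - expected) ** 2 / expected for drink, count in observed.items()
}
chi_squared = sum(contributions.values())

degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE = 7.82  # from the tables: p = 0.05 at 3 degrees of freedom

reject_null = chi_squared > CRITICAL_VALUE
print(f"chi-squared = {chi_squared:.2f} with {degrees_of_freedom} d.f.")
print("Reject the null hypothesis" if reject_null else "Accept the null hypothesis")
```

Looking at the individual contributions shows which category is responsible for most of the difference, here Inca Cola.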
Page 81
Bio Factsheet
September 2000
Number 74
Hypothesis Tests & Mann-Whitney U-test
Many biology projects involve a hypothesis test. Students often find difficulty in deciding on suitable hypotheses, and accordingly can waste time collecting
unhelpful data. This Factsheet explains what is involved in a statistical test of a hypothesis and discusses the role of the null hypothesis and the level
of significance. It also covers in detail the calculation and use of the Mann-Whitney U-test, which may be applicable to sets of biological data.
Hypotheses
All statistical tests involve testing a null hypothesis (H0) against an
alternative hypothesis (H1).
The null hypothesis can be described as "the boring case" - i.e. nothing has
changed. For example:
H0: There is no difference in species diversity in riffles and flats in a
particular river
H0: There is no correlation between the yield of a particular crop and
the amount of water supplied
H0: There is no difference in the amount of lichen on the north and south
sides of a tree
The null hypothesis cannot ever be of the form "something is greater than
something else" or "something is related to something else". The exact form
of it depends on the test you are using (see September 1997, Bio Factsheet
No. 03 - Which Stats Test Should I Use?).
The alternative hypothesis is effectively saying the opposite of the null
hypothesis - for example, for the last case above the alternative hypothesis
would be:
H1: There is a difference in the amount of lichen on the north and south
sides of a tree
When we carry out an investigation, we start off assuming that the null
hypothesis is true, and only change our minds if the data obtained in the
investigation provides strong enough evidence. This is rather analogous to
the situation in a court of law, where the defendant is assumed innocent
unless proven guilty!
Obviously there will always be some chance variations - if, for example, the
areas of lichen on the north and south sides of the tree only differed by 0.1
mm2, we would probably feel - even without conducting any tests! - that
this was not a "significant" difference.
The role of the statistical test is to give an objective definition of what
constitutes sufficient evidence to reject H0.
Choosing Hypotheses
Good hypotheses for a statistical test must:
• be specific, not vague or general
• refer to something that can be measured in an unambiguous way
• be simple and not attempt to include several variables
• include a null hypothesis
Table 1 shows some examples of "bad" hypotheses, and how they can be
improved.
Table 1. Choosing a hypothesis
Original hypothesis: the more you smoke, the less fit you are
What's wrong with it:
• Not specific - how do you measure fitness?
• Need to eliminate other variables such as age
• There's no null hypothesis.
Improved version:
H0: there is no correlation between the number of cigarettes smoked and recovery time.
H1: there is some correlation between the number of cigarettes smoked and recovery time.
Original hypothesis: the closer to the road, the higher the pollution.
What's wrong with it:
• Not specific - what sort of pollution?
• How will it be measured?
• There's no null hypothesis.
Improved version:
H0: there is no correlation between lead levels and distance from the road.
H1: there is some correlation between lead levels and distance from the road.
Original hypothesis: slope affects vegetation.
What's wrong with it:
• Not specific - which aspect of the slope is referred to? Is
it the gradient, the altitude or the length of the slope?
• Not measurable - you cannot just measure "vegetation".
Should it be species diversity, or percentage cover, or
biomass, or incidence of a particular species?
• There's no null hypothesis.
Improved version:
H0: the gradient of the slope has no effect on percentage cover.
H1: the gradient of the slope has some effect on percentage cover.
Doing the test
Whichever statistical test is used, we will effectively be plugging our data
values collected in the experiment into some formula, and coming out with
a single number. This number is what we will use to decide whether or not
to reject the null hypothesis.
To make that decision, we will have to compare this number we have worked
out - which is often called a test statistic - to the appropriate statistical table.
There are different tables for different statistical tests. Table 2 (overleaf)
shows an extract from a statistical table for the Mann-Whitney U-test.
Statistical tables give critical values at various significance levels.
The significance level is a measure of how strong we are requiring the evidence
to be before we reject H0. Common significance levels used are 10% (0.1),
5% (0.05) and 1% (0.01). A 1% significance level, for example, means that
we have only a 1% chance of rejecting H0 when we shouldn't have, whereas
a 10% level would give us a 1 in 10 chance of rejecting H0 when we should
have accepted it.
To get an idea of what this means, imagine 100 students carrying out the same
investigation into areas of lichen on the north and south sides of the same
species of tree. We will imagine that there is really no significant difference
in area - in other words, the null hypothesis is true.
The students will all collect their data at different times, and hence get
slightly different results. If they all carry out their statistical test at a 5%
significance level, then on average five of them would find themselves
rejecting the null hypothesis, because they happened to get "odd" data.
The critical value is just the value we have to compare with the number we
have worked out - the test statistic - to decide whether or not we should reject
H0. Each significance level - and each sample size - has its own critical value.
Critical values come from books of statistical tables.
For the Mann-Whitney U-test, we reject H0 if our value is smaller than the
critical (tables) value. For most other tests, we reject H0 if our value is
bigger than the critical (tables) value.
It is worth noting the way that the critical values vary with sample size; with
a large sample, it is much easier to get a significant result!
Table 2. Critical values for the U-test
                  n1 (sample 1)
                  5            6            7            8
n2 (sample 2)  α  10%   5%     10%   5%     10%   5%     10%   5%
5                 4     2      -     -      -     -      -     -
6                 5     3      7     5      -     -      -     -
7                 6     5      8     6      11    8      -     -
8                 8     6      10    8      13    10     15    13
Mann-Whitney U-test
We use the U-test to compare the median values of two sets of data - e.g. the species diversity on a path and
off a path. We are just trying to find out whether there is a difference - e.g. whether being on a path affects the
diversity.
The hypotheses to be tested are:
H0 : there is no difference between X and Y
H1 : there is a difference between X and Y
(If you wish to be mathematically correct, you would use H0: median1 = median2; H1: median1 ≠ median2) We
will reject the null hypothesis if the value we calculate (the test statistic) is below the value from the tables
(the critical value).
The procedure for carrying out the test will be illustrated by applying it to data on species diversity on and
off a path.
GLOSSARY
Simpson's Diversity Index is calculated using the formula
$\text{Diversity} = \frac{N(N-1)}{\sum n(n-1)}$
where n refers to the number of individuals of each particular species and N is the total number of
individuals.
METHOD AND APPLICATION
1. Write down your hypotheses
Application:
H0: There is no difference in species diversity on and off the path
H1: There is a difference in species diversity on and off the path
2. Obtain data about the things you wish to compare - you need two sets of data
Each set of data must contain at least 5 values, but they don't
have to have the same number of values as each other.
Application: The figures for species diversity in 8 path sites and 8 off-path sites are:
Site        1       2       3       4       5       6       7       8
on-path     2.20    4.65    6.00    3.47    4.33    2.20    2.50    3.33
off-path    4.09    2.93    3.88    10.50   3.50    5.14    4.40    10.00
3. Consider one set of data at a time - say we start with the on-path sites.
We now must calculate a score for each on-path site.
A site is given:
1 for every off-path site that has a higher value
0.5 for every off-path site that has an equal value.
Application: Site 1, on-path: The following off-path sites have higher values:
1,2,3,4,5,6,7,8.
Hence the score is 8
Similarly, for the other on-path sites, the scores are:
Site     1    2    3    4    5    6    7    8
Score    8    3    2    7    4    8    8    7
4. Sum the scores of the on-path sites. This gives you the
overall on-path score
Application: So the total on-path score is 47 (8+3+2+7+4+8+8+7)
5. Repeat steps 3) and 4) for off-path sites. This time, you will
be awarding points for on-path sites with higher or
equal scores.
Application: We now find the scores for off-path sites:
Site 1, off-path: The following on-path sites have higher values: 2,3,5
Hence the score is 3
Similarly, for the other off-path sites the scores are
Site     1    2    3    4    5    6    7    8
Score    3    5    3    0    3    1    2    0
So the total off-path score is 17.
6. Take the smaller of the two scores. This is the U-value
Application: So the U-value is 17
7. Compare your calculated U-value with the critical value in
the tables at the appropriate significance level. If your value is
smaller, reject H0 - otherwise accept H0.
Application: The critical value for two samples of 8 is 13 at the 5% level of significance.
Since our value is 17, we accept H0 and conclude that there is no significant
difference between species diversity on and off the path.
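The scoring procedure in steps 3 to 7 can be sketched in Python as a check on the worked example. The data and the critical value of 13 are taken from the application above; the function and variable names are illustrative, and the decision rule follows the wording of step 7 (reject H0 only if U is smaller than the critical value).

```python
def total_score(sample, other_sample):
    """Score 1 for every value in other_sample that is higher than a value
    in sample, and 0.5 for every value that is equal (steps 3 and 4)."""
    score = 0.0
    for value in sample:
        for other in other_sample:
            if other > value:
                score += 1.0
            elif other == value:
                score += 0.5
    return score


# Species diversity at the 8 on-path and 8 off-path sites
on_path = [2.20, 4.65, 6.00, 3.47, 4.33, 2.20, 2.50, 3.33]
off_path = [4.09, 2.93, 3.88, 10.50, 3.50, 5.14, 4.40, 10.00]

on_path_score = total_score(on_path, off_path)
off_path_score = total_score(off_path, on_path)

# Step 6: the smaller of the two scores is the U-value
u_value = min(on_path_score, off_path_score)

# Step 7: critical value for two samples of 8 at the 5% level
CRITICAL_VALUE = 13
reject_null = u_value < CRITICAL_VALUE

print(on_path_score, off_path_score, u_value, reject_null)
```

A useful check is that the two scores always add up to the number of on-path sites multiplied by the number of off-path sites, here 8 x 8 = 64.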
Projects using the Mann-Whitney U-test
• Find the recovery time after exercise (the time taken for the heart rate to return to normal) for at least five smokers and five non-smokers of
approximately the same age. Test whether or not there is a difference in recovery rates.
• Measure the percentage vegetation cover at at least 5 sites on shaded and non-shaded slopes. Test whether or not there is a difference in % cover.
• For a river with a source of pollution, find the species diversity at five or more sites above and five or more sites below the source of pollution.
Test whether or not there is a difference in species diversity.
Acknowledgements; This Factsheet was researched and written by Cath Brown. BioPress Factsheets may be copied free of charge by teaching staff or
students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in
any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136
Page 82
Bio Factsheet
January 1999
Number 34
Species Diversity
Species diversity is a very important ecological idea. It can be expressed mathematically and describes the number of individuals and the
number of species in a community. It is estimated that there are 13-14 million different species on Earth. Humans have recorded about
2 million of these; in other words we simply know nothing about large parts of the animal and plant kingdom (Table 1). The ecosystems
with greatest species diversity are tropical rainforests, coral reefs and large tropical lakes.
Table 1. Proportion of species discovered annually
Species            New species discovered annually      Proportion of
                   (as a % of those already known)      species described
Birds              0.8                                  High
Reptiles           1.17                                 High
Platyhelminthes    1.58                                 Moderate
Fungi              2.43                                 Very low
What seems more certain is that, in terms of number of species, insects
rule the planet! (Fig 1).
Fig 1. Possible proportions of total species
KEY
A Vertebrates 0.4%
B Algae 1.6%
C Protozoans 1.6%
D Molluscs 1.6%
E Plants 3.0%
F Bacteria 3.0%
G Viruses 4.0%
H Nematodes 4.0%
I Arachnids 6.0%
J Fungi 8%
K Insects 65.0%
Measuring species diversity
The simplest way of measuring diversity is to count the number of different
species. A garden on the outskirts of a small village might be visited by
approximately thirty different species of birds; a city garden will probably
have many fewer. So we can count the number of species present under
standard conditions and produce a quantitative measure of diversity.
There is a problem here, however. Look at the example in Table 1.
Table 1. Number of different species of plant found in two areas
Species    Total number of plants in
           Quadrat X    Quadrat Y
A          95           18
B          2            23
C          1            27
D          3            14
E          1            20
This shows the plants found in two quadrats in some sand-dunes on the
Welsh coast. There are five species in each but common sense suggests that
the overall diversity is very different. Nearly all the plants in quadrat X
belong to species A. In quadrat Y all five species are present in large
numbers. Quadrat Y seems to have greater diversity than quadrat X. What
we need is a way of calculating diversity which takes into consideration the
number of individuals as well as the number of species. There are many
different ways of doing this but one of the simplest is to calculate an index
of diversity using the formula shown in Box 1 (overleaf).
Why is diversity important?
In order to interpret information about diversity we need to understand a
very important principle. The distribution of living organisms is influenced
by abiotic factors such as the amount of rainfall, soil pH, temperature and
so on. The more extreme these abiotic conditions, the fewer the species
that can survive and, therefore, the lower the diversity of organisms found
there.
We will use this principle to compare the diversity of living organisms
found in the Arctic with the diversity of those found in tropical rain
forests. In the Arctic, over the long winter period, temperatures rarely rise
above freezing and, as a consequence, water remains biologically unavailable,
frozen solid as ice. The Arctic winter is not only cold but dark; for several
months the sun barely shows above the horizon. These are clearly extremely
harsh abiotic conditions. Not surprisingly, relatively few species are adapted
to survive an arctic winter. Arctic ecosystems therefore tend to have low
diversities.
However, within a tropical rain forest, there is water in abundance and
temperatures are high throughout the year. Many organisms can survive in
these conditions and the species diversity in such places can be very high.
Exam Hint - As a general rule, the greater the species diversity in a
particular ecosystem, the more stable it is.
Box 1. Calculating Diversity
The index of diversity (d) is calculated from the following formula
$d = \frac{N(N-1)}{\sum n(n-1)}$
where N = total number of organisms of all species
and n = total number of organisms of a particular species
Table 2 shows the number of plants of different species growing between
the railway lines in a station.
Table 2.
Species              Number of plants of this species
                     in study area
Dandelion            7
Oxford ragwort       28
Common sowthistle    1
Buddleia             2
Mugwort              5
$d = \frac{43 \times 42}{(7 \times 6) + (28 \times 27) + (1 \times 0) + (2 \times 1) + (5 \times 4)}$
$d = \frac{1806}{42 + 756 + 0 + 2 + 20}$
$d = \frac{1806}{820} = 2.2$
On its own, this figure of 2.2 tells you very little. It does, however, allow
you to compare the diversity of plants growing between the railway
lines with the diversity of plants growing in other areas. In this particular
case, the diversity was much lower than on a disused piece of rail track
nearby.
Diversity and Succession
Succession is the ecological process in which the different species of
organisms in a community are gradually replaced by others over a period
of time. Sand dunes are found in many areas around the coast. Near the
sea, abiotic conditions are harsh. The sand is blown by the wind and is
unstable. It contains little humus and therefore dries out very rapidly.
There are also low levels of soil nutrients such as nitrates. One of the few
plants able to survive in these conditions is marram grass. In time, the
roots of the marram grass bind the sand particles together. Plants die and
decompose, increasing the humus in the soil. Marram grass can no longer
survive and it is replaced by other species of plant. It is the basic principle
that we applied earlier. As succession takes place, abiotic conditions
become less severe. More species occur and there is a higher diversity.
This is summarised in Fig 2.
Fig 2. Sand dune succession (from the sea, moving inland)
Early stages in succession:
This is a very hostile environment. It is very exposed and sand grains are
continually being blown by the wind. There is little humus in the soil and
water is not retained very well. Few species can survive in these conditions.
Later stages in succession:
Conditions are far more suitable for plant growth. Plant cover means that
the sand is no longer being blown away. The humus in the soil resulting from
dead plant remains enables it to hold water much better. More species
grow here.
Result: Increasing diversity as succession progresses
Diversity and zonation
Abiotic conditions often vary within an ecosystem. Around the high tide
level on a rocky seashore, for example, abiotic conditions are extreme.
Organisms that live there must be able to withstand considerable fluctuations
in temperature and salinity. On the lower shore, abiotic conditions show
much less variation. Organisms are covered by sea water for much of the
day. Temperature and salinity will vary little. Applying our general principle
again, as we go down the shore, abiotic conditions become much less severe
(Fig 3).
Fig 3. A rocky sea shore
Upper shore:
Harsh environment where, over the course of a tidal cycle,
the temperature and salinity can vary enormously.
Organisms are completely covered in water at high tide
while, at low tide, they are exposed to the drying effect of
the air.
Lower shore:
Organisms are almost always under water.
Temperature and salinity fluctuate very little.
Dehydration is not a problem.
Result: Many more species of marine organisms can live on the lower
shore. Therefore diversity increases down the shore.
Diversity and Pollution
Human activity frequently leads to pollution of the environment. Pollution
results in harsher environmental conditions, so the more pollution, the
lower the diversity of organisms.
Biologists can make use of this idea to monitor pollution levels. A number
of different indices of diversity have been designed to assess water quality.
They take into account the fact that some groups of animals are much more
sensitive to pollution than others. Each group that is present at a particular
site is given a value, the values are added together and a figure is obtained
which provides an idea of the amount of organic pollution at the site.
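As a check on the index of diversity calculation in Box 1, the formula d = N(N-1) / Σn(n-1) can be sketched in a few lines of Python. The counts are the railway line plants and the two sand dune quadrats given in this Factsheet; the function and variable names are illustrative only.

```python
def diversity_index(counts):
    """Index of diversity d = N(N-1) / sum of n(n-1), where n is the number
    of organisms of each species and N is the total number of organisms."""
    total = sum(counts)
    return total * (total - 1) / sum(n * (n - 1) for n in counts)


# Plants growing between the railway lines: dandelion, Oxford ragwort,
# common sowthistle, buddleia, mugwort
railway = [7, 28, 1, 2, 5]

# Species A to E in the two sand dune quadrats
quadrat_x = [95, 2, 1, 3, 1]
quadrat_y = [18, 23, 27, 14, 20]

d_railway = diversity_index(railway)
d_x = diversity_index(quadrat_x)
d_y = diversity_index(quadrat_y)

print(round(d_railway, 1), round(d_x, 2), round(d_y, 2))
```

Both quadrats contain five species, but the index for quadrat Y comes out much higher than for quadrat X, which matches the common sense judgement that Y is the more diverse.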
Acknowledgements; This Factsheet was researched and written by Bill Indge
Curriculum Press, Unit 305B, The Big Peg, 120 Vyse Street, Birmingham. B18 6NF
Bio Factsheets may be copied free of charge by teaching staff or students, provided that their school is a
registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted,
in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136
Page 83
Bio Factsheet
www.curriculumpress.co.uk
www.curriculum-press.co.uk
Number 144
Spearman's Rank Correlation Coefficient
Spearman's Rank is one method of measuring the correlation between two variables. Correlation may be:
• positive (large values of one variable associated with large values of the other variable - eg nitrate concentration and plant growth)
• negative (large values of one variable associated with small values of the other - eg soil salinity and plant growth )
Correlation is measured on a scale from -1 to 1
Ranking
Ranking is similar (though not identical) to awarding places in a race. When doing the ranking, it does not matter whether you give the rank "1" to the
largest value, or to the smallest value - provided you are consistent.
If there are no ties, you just give out the ranks in the obvious way, starting at 1 and carrying on to however many pieces of data you have.
If there are ties, you have to be a bit careful:
For example, suppose three pieces of data tie for 4th place.
Normally, if there hadn't been any ties, you'd expect the next three pieces of data to "use up" the ranks 4, 5, 6
So we give all three pieces the average of 4, 5 and 6 - that's 5.
The next piece of data then has rank 7 (as ranks 4, 5 and 6 have been "used up")
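The rule for tied ranks can be sketched in Python as follows. The function name is hypothetical; the first data set is made up to reproduce the "three pieces of data tie for 4th place" case, and the other two are the soil salinity and plant height figures from this Factsheet's worked example, with rank 1 given to the highest value.

```python
def ranks_with_ties(values, highest_first=True):
    """Give rank 1 to the highest (or lowest) value; tied values all
    receive the average of the ranks they would have used up."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=highest_first)
    ranks = [0.0] * len(values)
    position = 0
    while position < len(order):
        # Find the end of the run of equal values starting at this position
        end = position
        while end + 1 < len(order) and values[order[end + 1]] == values[order[position]]:
            end += 1
        shared_rank = ((position + 1) + (end + 1)) / 2
        for index in order[position:end + 1]:
            ranks[index] = shared_rank
        position = end + 1
    return ranks


# Made-up data: three values tie for 4th place
race = [90, 80, 70, 60, 60, 60, 50]
race_ranks = ranks_with_ties(race)

# Data from the worked example
soil_salinity = [28, 12, 15, 16, 2, 5]
plant_height = [10, 40, 40, 52, 75, 48]
salinity_ranks = ranks_with_ties(soil_salinity)
height_ranks = ranks_with_ties(plant_height)

print(race_ranks)
print(salinity_ranks)
print(height_ranks)
```

In the plant height data the two values of 40 would have used up ranks 4 and 5, so each receives 4.5 and the next value receives rank 6.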
Worked Example
The data below were collected on soil salinity and plant height.
Soil Salinity
28 12 15 16 2 5
Plant height (mm) 10 40 40 52 75 48
-1: Perfect negative rank correlation
0: No correlation
1: Perfect positive rank correlation
Which correlation coefficient?
Hypotheses
There are three correlation coefficients in common use; Spearman's is used
most often (and hence is the principal subject of this Factsheet), but there
are cases when the other coefficients should be conisdered:
As with any other statistical test, you are using the test to decide between
two hypotheses: -
Spearman's Rank Correlation Coefficient
• Can be used for any data that you can put in order smallest to largest
• Measures whether data are in the same order - eg does highest nitrate
concentration coincide with highest plant growth - rather than using
actual data values
• Not valid if there are a lot of ties (eg several pairs of samples having the
same pollution level), although one or two ties is OK.
• Easy to calculate for small data sets, but unwieldy for large data sets.
NO
Do the data look close to
a straight line?
RANK
CORRELATION
The alternative hypothesis can take three possible forms:
a) H1: there is some correlation between X and Y
b) H1: there is positive correlation between X and Y
PEARSON'S
SPEARMAN'S
We'll give rank 1 to the highest values for each:
Soil Salinity
Rank
Step 3: Work out "d" and "d2", where d stands for
the differences between pairs of ranks
Note: you must square each d individually
Step 4: Substitute into the formula
or
If you have a good scientific reason in advance (before actually getting any
results) for expecting a particular type of correlation, then choose b) or c).
If you do not have a reason for expecting a particular type, use a). If in
doubt - use a)
Alternative hypotheses b) and c) above are referrred to as directional because they specify a particular "direction" of correlation. Alternative a)
is non-directional. When you are doing the actual statistical test, you
need to be aware that a non-directional alternative requires you to do a
2-tailed test, but a directional alternative requires a 1-tailed test - further
details are given in the worked example overleaf.
Sample Size
The absolute minimum number of values for using Spearman's Rank is
4 - but it is very hard to get a significant result using this few! It's best
to use at least 7 - and if you can get up to about 15, better still. Very
large sample sizes (50+) can make it hard to handle the calculations, and
many Spearman's tables do not go up this high.
YES
KENDALL'S
1
28 12 15 16
1 4 3
2
2
6
5
5
Plant height (mm) 10 40 40 52 75 48
Rank
6 4.5 4.5 2
1 3
or
c) H1: there is negative correlation between X and Y
Are there a lot of ties?
NO
the alternative hypothesis (H1) - which is what you hope to get
evidence for.
Exam Hint: - Only the alternative hypothesis can be directional the null hypothesis is never directional.
YES
YES
•
Step 2: Work out the two sets of ranks, taking
care to allow for ties.
H0: there is no correlation between X and Y
Pearson's Product Moment Correlation Coefficient
• Can only be used for continuous data (eg lengths, weights)
• Uses the actual data, not just their ranks
• Measures how close to a straight line the data are - check on a scatter
graph that the data do approximate a straight line rather than a curve.
• Can be easier to get significant results than using rank correlation
• A nuisance to calculate by hand, but can be calculated automatically on
many graphic calculators and using a spreadsheet
• If you are unsure whether it is valid, it's better to use rank correlation
The flowchart shows how to choose your correlation coefficient.
NO
the null hypothesis (H0) - which is what you assume, until you get
convincing evidence otherwise.
H0: There is no correlation between soil
salinity and plant height
H1: There is negative correlation between
soil salinity and plant height
For any test of correlation, your null hypothesis is always:
Kendall's Rank Correlation Coefficient
• Like Spearman's, uses the ranks of the data rather than the actual data,
and can be used for any data that can be ordered.
• A good substitute for Spearman's if there are a lot of ties
• More of a nuisance to calculate than Spearman's
Is the data continuous?
(eg lengths, weights etc)
•
7KLVLVDVHQVLEOHFKRLFH
SURYLGHGZHNQRZWKH
SODQWLVQRWDKDORSK\WH
Step 1: Write down the hypotheses
rs =
1-
6Σd 2
(n3 - n)
Soil salinity rank
Plant height rank
d
d2
1
6
5
25
4
4.5
0.5
0.25
3
4.5
1.5
2.25
2 6
2 1
0 5
0 25
5
3
2
4
Σd 2 = 25 + 0.25 + 2.25 + 0 + 25 + 4 = 56.5
rs =
1-
6 × 56.5 =
1
(63 - 6)
-
7KHWZR´µYDOXHVWLH
7KH\·GQRUPDOO\KDYHXVHGXS
WKDQGWKSODFH²VRJLYH
WKHPERWKWKHDYHUDJHRI
DQG²WKDW·V
7KHQH[WRQHZLOOKDYHUDQN
DVUDQNVDQGKDYHEHHQ
XVHGXS
n=6
339 = -0.6142
210
rs = Spearman's Rank Correlation Coefficient
Σd 2= sum of the d2 values
n = number of pairs of values in sample
Step 5: Get a Spearman's table and look up the
critical value for the appropriate
significance level (usually 5% = 0.05),
sample size and 1-tailed or 2-tailed test.
We have n = 6, and we are doing
a one-tailed test, because of the
form of H1.
So critical value is 0.771
Step 6: Make a decision - if your calculated
value of rs is bigger than the
critical value (ignoring signs), you can
reject the null hypothesis.
Otherwise you must accept it.
1-tail   0.1     0.05    0.025   0.01    0.005
2-tail   0.2     0.10    0.05    0.02    0.01
n
4        1.000   1.000   1.000   1.000   1.000
5        0.700   0.900   0.900   1.000   1.000
6        0.657   0.771   0.829   0.943   0.943
7        0.571   0.679   0.786   0.857   0.893
Our value (-0.6142) is smaller than the critical value (ignoring signs),
so we must accept the null hypothesis - there is no significant correlation between soil salinity and
plant height at the 5% significance level.
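As a cross-check on the hand calculation, the steps of the worked example can be sketched in Python. The function names `average_ranks` and `spearman_rs` are illustrative assumptions, not part of the Factsheet; the critical value 0.771 is copied from the Factsheet's table (n = 6, 1-tailed, 5% level). The shortcut formula is only approximate when ties are present, so a statistics package may report a slightly different coefficient.

```python
def average_ranks(values):
    """Rank 1 = largest value; tied values share the average of their ranks."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1                    # first rank the value occupies
        last = len(ordered) - ordered[::-1].index(v)    # last rank the value occupies
        ranks.append((first + last) / 2)
    return ranks


def spearman_rs(x, y):
    """Shortcut formula: rs = 1 - 6*sum(d^2) / (n^3 - n)."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d_squared = [(a - b) ** 2 for a, b in zip(rank_x, rank_y)]  # square each d individually
    n = len(x)
    return 1 - 6 * sum(d_squared) / (n ** 3 - n)


salinity = [28, 12, 15, 16, 2, 5]
height_mm = [10, 40, 40, 52, 75, 48]

rs = spearman_rs(salinity, height_mm)
critical = 0.771  # Factsheet table: n = 6, 1-tailed test, 5% significance

print(round(rs, 3))  # about -0.614
print("reject H0" if abs(rs) > critical else "do not reject H0")
```

Comparing `abs(rs)` with the critical value mirrors the "ignoring signs" instruction in Step 6.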
Further Investigations Using This Test
• Relationship between concentration of fungicide and zone of
inhibition for a particular fungus
• Relationship between molecular size and rate of metabolism in yeast
• Relationship between algal growth and nitrate concentration
• Relationship between blackspot disease in roses and traffic levels
• Relationship between mass of leaf buried and earthworm mass
• Relationship between pest density and yield for broad beans
• Relationship between body mass and running ability for house spider
• Relationship between pH of soil and pH of leaf litter
Acknowledgements: This Bio Factsheet was researched and written by Cath Brown. Curriculum Press. Bank House, 105 King Street, Wellington, TF1 1NU. Geopress Factsheets may be copied free of charge by teaching staff
or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission
of the publisher. ISSN 1351-5136
Page 84
Bio Factsheet
January 2003
Number 120
www.curriculumpress.co.uk
The Chi-Squared Test for Association
The chi-squared test for association (or association index or chi-squared contingency tables) is commonly used in projects to measure whether two factors are associated (e.g. whether greater numbers of a certain plant species occur in areas with high rainfall). This Factsheet explains how to use this test.

When do I use this test?
This test is one way of examining whether two variables are related - for example, "Is the colour of hydrangea flowers related to the type of soil?"
• The null hypothesis will be that the two variables are not associated or are independent. (This means that they do not affect each other)
• The alternative hypothesis will be that the two variables are associated or are not independent.

So, in the above example, the hypotheses would be:
H0: Hydrangea flower colour is independent of soil type
H1: Hydrangea flower colour is not independent of soil type

It is important to note that the alternative hypothesis does not tell you how the two variables are related - it could be that pink flowers occur in sandy soil and blue in clay soil, or vice versa.

Any chi-squared test can only be used with frequencies (that is, numbers of items in particular categories), and all expected frequencies must be at least 5.

The easiest way to guarantee this is to make sure there are at least 5 items in each category, but if this really isn't possible, you can usually get away with one category having less than 5 if all the others have substantially more. If your expected frequencies turn out to be less than 5, you should collect more data and redo the test.

How does it work?
To see how this test works, let's look at hydrangea flowers in different soil types.
The flowers can be pink or blue, and we'll look at 50 plants grown in clay soil and 50 plants grown in sandy soil.
Suppose the results were:
        pink   blue
sand    25     25
clay    25     25
From these results, we would probably conclude that there was no link between flower colour and soil type - flower colour and soil type were independent.

If, instead, the results were:
        pink   blue
sand    50     0
clay    0      50
We'd conclude there was a link between flower colour and soil type.

But what if the results were:
        pink   blue
sand    30     22
clay    20     28
How would we decide whether there was "enough" difference, to decide that flower colour and soil type were linked?

The chi-squared test for association gives us a way of deciding what constitutes "enough" difference.

The test is used to compare observed frequencies (what is produced from the investigation) with expected frequencies (what you'd expect from the null hypothesis).

Table 1. Investigations using chi-squared test for association
Investigation | Null Hypothesis | What to Measure
Whether colour of hydrangea flowers is affected by soil type | Colour of flowers is independent of soil type | Choose two areas of differing soil types, and note the number of hydrangea plants in each area of each colour.
Whether a particular type of caterpillar feeds preferentially on a particular type of plant | Caterpillar species' feeding is independent of the type of plant | In a specific area, note the number of plants of the type required with any of the caterpillars on them, the number of such plants without caterpillars, the numbers of other plants with caterpillars and without caterpillars.
Whether snails of the same species with different coloured shells prefer different habitats | Shell colour is independent of habitat preference. | Using the same area quadrat in each habitat, note the number of snails of each type found.
Whether dandelions and plantains grow preferentially together. | Incidence of dandelions is independent of incidence of plantains | Using a standard size quadrat, note the number of quadrats in which both species are found, in which neither are found, and in which just dandelions or just plantains are found.

Worked Example
In an investigation into the habitats preferred by light and dark-shelled snails, the following results were obtained:
                      Light   Dark
Limestone pavement    121     85
Limestone woodland    74      111

Step 1: Write down the hypotheses
H0: Shell colour is independent of habitat preference
H1: Shell colour is not independent of habitat preference

Step 2: Work out the row, column and overall totals for the original data
                      Light            Dark             Total
Limestone pavement    121              85               85 + 121 = 206 (row total)
Limestone woodland    74               111              74 + 111 = 185 (row total)
Total                 121 + 74 = 195   85 + 111 = 196   206 + 185 (or 195 + 196) = 391
                      (column total)   (column total)   (overall total)

Step 3: Calculate the expected frequencies for each category using the formula
$E = \frac{\text{row total} \times \text{column total}}{\text{overall total}}$
                      Light                   Dark
Limestone pavement    206 × 195/391 = 102.7   206 × 196/391 = 103.3
Limestone woodland    185 × 195/391 = 92.3    196 × 185/391 = 92.7

Step 4: For each of your categories, work out $\frac{(O - E)^2}{E}$
(O = observed values, from the experiment
E = expected values, from step 3)
eg for Light Pavement: $\frac{(121 - 102.7)^2}{102.7} = 3.26$
            Light   Dark
Pavement    3.26    3.24
Woodland    3.63    3.61

Step 5: Add up all these values.
This gives the chi-squared value
chi-squared value = 3.26 + 3.24 + 3.63 + 3.61 = 13.74

Step 6: Work out the degrees of freedom using the formula:
(rows - 1)(columns - 1)
Degrees of freedom = (2 - 1)×(2 - 1) = 1

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom.
Tables value for 1 df and 5% significance is 3.84
df   .10    .05    .025    .01     .005
1    2.71   3.84   5.02    6.63    7.88
2    4.61   5.99   7.38    9.21    10.60
3    6.25   7.81   9.35    11.34   12.84
4    7.78   9.49   11.14   13.28   14.86

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is larger than the tables value, so we reject the null hypothesis.

Step 9: Write down your conclusion.
At the 5% level of significance, we can conclude that shell colour and habitat preference are not independent.

As with any other statistical test, the results are only as reliable as the original data. In evaluating this sort of investigation, you should consider:
• whether there are any other factors, other than those measured, which might affect the numbers of each type of snail
• whether you sampled enough areas in each category
• whether your sampling method could have introduced any kind of bias
• whether the areas sampled are "typical" of limestone woodland and pavement.
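Steps 2 to 8 of the snail example can be sketched in Python as a check on the arithmetic. The function name `chi_squared_association` and the variable names are illustrative assumptions; the critical value 3.84 is the Factsheet's table value for 1 degree of freedom at the 5% level. The sketch keeps the expected frequencies unrounded and applies no continuity correction, so it gives about 13.7 for the snails, slightly below the 13.74 obtained above from expected frequencies rounded to one decimal place.

```python
def chi_squared_association(table):
    """table: rows of observed frequencies. Returns (chi2, degrees of freedom, expected)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(row_totals)
    # Expected frequency = row total x column total / overall total
    expected = [[r * c / overall for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


CRITICAL_5PCT_DF1 = 3.84  # from the Factsheet's chi-squared table

snails = [[121, 85],    # limestone pavement: light, dark
          [74, 111]]    # limestone woodland: light, dark
chi2, df, expected = chi_squared_association(snails)
print(round(chi2, 2), df, chi2 > CRITICAL_5PCT_DF1)

# The "But what if the results were" hydrangea table from earlier in the Factsheet.
hydrangeas = [[30, 22],  # sand: pink, blue
              [20, 28]]  # clay: pink, blue
chi2_h, df_h, _ = chi_squared_association(hydrangeas)
print(round(chi2_h, 2), df_h, chi2_h > CRITICAL_5PCT_DF1)
```

Under these assumptions the hydrangea table gives a chi-squared value below 3.84, so that difference would not count as "enough" at the 5% level.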
Acknowledgements: This Geo Factsheet was researched and written by Cath Brown. Curriculum Press. Unit 305B, The Big Peg, 120 Vyse Street, Birmingham, B18 6NF Biopress Factsheets may be copied free of charge by teaching
staff or students, provided that their school is a registered subscriber. No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission
of the publisher. ISSN 1351-5136
Page 85
Bio Factsheet
January 2001
Number 79
The Chi-Squared Test for Goodness of Fit
The chi-squared (χ²) test is widely used within project work. This Factsheet will tell you when and how to use it. Questions using the chi-squared test may occur on exam papers, but you will not have to remember the formula - it will be given to you.

Hypotheses
The purpose of all statistical tests is to choose between two hypotheses - for example: "the leaves are smaller at the top of the tree than at the bottom" and "the leaves are not smaller at the top of the tree than at the bottom". The null hypothesis (H0) always has to be the "boring case" - that there's no difference between things. In the above example, it would be "the leaves are not smaller at the top of the tree" - or equivalently, "the leaves are the same size at the top and bottom of the tree". It can never be "the leaves are bigger at the top of the tree". The other hypothesis is called the alternative hypothesis (H1). In our leaves example, it would be "the leaves are smaller at the top of the tree".

When we carry out a test, we always start out assuming that the null hypothesis is true, and we only change our minds if we have enough evidence - it's like a trial, when you are assumed to be innocent unless there's enough evidence to show that you aren't! Table 1 shows some possible investigations using chi-squared and the corresponding null hypotheses.

Table 1. Investigations using chi-squared
Investigation | Null Hypothesis (H0) | What to Measure
How the concentration of nitrate in water affects seed germination. | The concentration of nitrate in water has no effect on seed germination | The number of seeds germinating within a set time period when watered with solutions containing different concentrations of nitrate.
How a particular type of pollution affects a particular organism | The level of pollution has no effect on the numbers of the organism | The numbers of the organism obtained from a set area in two sites which are similar, except for one being polluted and one not polluted
Are the predictions of genetics accurate? | The numbers of organisms in each category are in accordance with the predictions of genetics | The numbers of organisms in each category

Exam Hint: - Marks are only awarded for an appropriate use of statistics. Decide exactly what your hypotheses are and what test you are going to use before collecting your data.

What is chi-squared?
Chi-squared goodness of fit is used to test whether the actual results of an experiment fit in with what we'd expect if the null hypothesis were true. To see how this works, imagine testing a normal coin to see whether heads and tails were equally likely. Our null hypothesis is that they are equally likely - we have no reason to believe otherwise before we do an experiment, and we always choose "the boring case" for the null hypothesis. If we tossed our coin 600 times, we'd expect to get about half of each - around 300 heads and 300 tails.

If we then actually tossed the coin 600 times and got over 500 heads, we'd feel that this was a long way off from our predictions - so we'd probably decide that the coin was weighted. However, if we got 305 heads and 295 tails, we'd probably feel this was close enough, and decide the coin was OK. The chi-squared test lets us decide on an accurate basis what counts as "close enough".

Obviously, we can never be absolutely certain that our decision is correct - we could get 500 heads by chance even if the coin wasn't weighted. We can decide how far off the results have to be by carrying out the test at different significance levels. The smaller the significance level we use, the "further off" the results need to be for us to reject the null hypothesis. Using a smaller significance level is like requiring the evidence to be more convincing. Statistical tests in biology are usually carried out at the 5% significance level.

When can chi-squared be used?
You can only use chi-squared when you have frequencies - that is, numbers of items in particular categories. You cannot use it to compare measurements or other figures directly. For example, if you were doing an experiment on seed germination, you could use chi-squared to compare the numbers of seeds germinating within a week in each of four different solutions. However, you could not use it to compare the heights of four different seedlings.

In order that the test be valid, it is also important that you should expect at least five items in each category. Sometimes it may be necessary to combine categories in order to achieve this - for example, if you were researching public attitudes by using a questionnaire, you might need to combine the responses "not very concerned" and "not at all concerned".

Exam Hint: - Do not try to be too ambitious in your investigation. Testing one simple, easily measurable hypothesis involving only one variable successfully will gain more marks than an attempt at investigating a situation affected by many variables.

Worked Example
A student decides to investigate whether pollution levels affect the incidence of asthma. They obtain a sample of 50 sixth formers from their own school, which is situated in a large, polluted city and 50 sixth formers from another school, which is situated in a small, relatively unpolluted town. Each sixth former is asked whether or not s/he suffers from asthma.
The student obtains the following results:
City: 16 with asthma
Town: 8 with asthma

Step 1: Write down the hypotheses
Hypotheses are:
H0 (null hypothesis): Pollution has no effect on incidence of asthma
H1 (alternative hypothesis): Pollution has some effect on the incidence of asthma

Step 2: Work out the expected frequencies
We do this by adding up all the people with asthma, and dividing by the number of different categories we're looking at - which is two (city and town).
So the expected frequencies are (16 + 8) ÷ 2 = 12 in each area
Exam Hint: - Don't worry if your expected frequencies are not whole numbers. They don't have to be! Do not round them to the nearest whole number - this will make your test less accurate.

Step 3: Put your observed frequencies (from the actual experiment) and the expected frequencies (from step 2) in a table.
               City   Town
Observed (O)   16     8
Expected (E)   12     12

Step 4: For each of your categories, work out $\frac{(O - E)^2}{E}$
For the city: $\frac{(16 - 12)^2}{12} = 1.33$
For the town: $\frac{(8 - 12)^2}{12} = 1.33$

Step 5: Add up all these values.
This gives the chi-squared value
chi-squared value = 1.33 + 1.33 = 2.66

Step 6: Work out the degrees of freedom.
This is one less than the number of categories
Degrees of freedom = 2 − 1 = 1
Exam Hint: - Don't worry about what degrees of freedom means! Unless you want to study Statistics as a subject, you don't need to know!

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom. In the table 5% is shown as 0.05.
We look for the 5% level for one degree of freedom.
This is 3.84 (Table 2)

Table 2. Chi-squared tables
df   0.10   0.05   0.025   0.01    0.005
1    2.71   3.84   5.02    6.63    7.88
2    4.61   5.99   7.38    9.21    10.60
3    6.25   7.81   9.35    11.34   12.84
4    7.78   9.49   11.14   13.28   14.86

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is smaller than the value from the tables, so we accept the null hypothesis - there is no significant difference in the amount of asthma between the city and the town.

Points to note
1. This investigation needs care in sampling technique!
• The sixth formers need to live in the area, not just go to school in it
• They may not have lived there for long
• School sixth formers may not be representative of the population as a whole
• How does the student know what the pollution levels are - are they just assuming it?
• Pollution levels are not the same throughout a city or town.
2. This test is not telling us that the null hypothesis is definitely correct - it is telling us that we haven't got enough evidence to reject it.
3. To improve the chance of getting a significant result - in other words, rejecting the null hypothesis - a larger sample usually helps! For example, if the student had taken a sample of 100 from each school and found the city and town had 32 and 16 respectively with asthma, s/he would have been able to reject the null hypothesis (check this calculation!)
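The "check this calculation!" suggestion can be followed with a short Python sketch. It is an illustration only: the function names `chi_squared_gof` and `equal_expected` and the `results` dictionary are assumptions, and 3.84 is the Factsheet's table value for 1 degree of freedom at the 5% level. The same helper is also applied to the 305 heads / 295 tails coin example.

```python
def equal_expected(observed):
    """Expected frequencies when every category is expected to be equally common."""
    mean = sum(observed) / len(observed)
    return [mean] * len(observed)


def chi_squared_gof(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_5PCT_DF1 = 3.84  # Factsheet Table 2: 1 degree of freedom, 5% level

cases = {
    "asthma, 50 per school": [16, 8],
    "asthma, 100 per school": [32, 16],
    "coin, 600 tosses": [305, 295],
}

results = {}
for label, observed in cases.items():
    chi2 = chi_squared_gof(observed, equal_expected(observed))
    results[label] = chi2
    decision = "reject H0" if chi2 > CRITICAL_5PCT_DF1 else "do not reject H0"
    print(f"{label}: chi-squared = {chi2:.2f} -> {decision}")
```

With the larger sample the same 2:1 pattern gives a chi-squared value of about 5.33, which exceeds 3.84, in line with point 3.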
Page 86
Worked Example
A student decides to investigate the results of crossing plants with red and yellow flowers. The student knows that the allele for red flowers is dominant.
S/he carries out the cross, and obtains 18 plants, all of which have red flowers.
a) Give the genotype of the 18 red-flowered plants produced. (1 mark)
b) Explain what, if anything, can be deduced about the genotype of the parent red-flowered plant. (2 marks)
The student then crosses two of the offspring red-flowered plants and obtains 17 red-flowered plants and 3 yellow-flowered plants.
c) i) Find the ratio of red-flowered to yellow-flowered plants that would be expected. (2 marks)
ii) Carry out a chi-squared test at the 5% significance level to determine whether the results are in accordance with your predictions. (10 marks)
Answer and Mark Scheme
a) It must be Rr (since the yellow parent would be rr, and any rr offspring would be yellow); (1 mark)
b) Probably RR; since no yellow-flowered offspring are produced, (but this is not certain); (2 marks)
c) i) Rr crossed with Rr produces RR, Rr, rR and rr in equal proportion; since the first three are all red-flowered, we would expect red-flowered:yellow-flowered to be 3:1;
ii)
Step 1: Write down the hypotheses
Hypotheses are:
H0: Results obtained are not significantly different from 3:1 ratio
H1: Results obtained are significantly different from 3:1 ratio;

Step 2: Work out the expected frequencies
We do this using
$\text{Expected no. of individuals in a category} = \frac{\text{Total no. of individuals} \times \text{Ratio number for that category}}{\text{Total of all the numbers in the ratio}}$
So expected for red = $\frac{20 \times 3}{1 + 3} = 15$;
Expected for yellow = $\frac{20 \times 1}{1 + 3} = 5$;
Exam Hint: - Check that your expected frequencies add up to the total number of individuals - 15 + 5 = 20

Step 3: Put your observed frequencies (from the actual experiment) and the expected frequencies (from step 2) in a table.
     Red   Yellow
O    17    3
E    15    5

Step 4: For each of your categories, work out $\frac{(O - E)^2}{E}$
Red: $\frac{(17 - 15)^2}{15} = 0.2667$;
Yellow: $\frac{(3 - 5)^2}{5} = 0.8$;

Step 5: Add up all these values.
This gives the chi-squared value
chi-squared value = 0.2667 + 0.8 = 1.0667;

Step 6: Work out the degrees of freedom.
This is one less than the number of categories
Degrees of freedom = 2 − 1 = 1;

Step 7: Get a chi-squared table and look up the value for the appropriate significance level (usually 5%) and the degrees of freedom.
We look for the 5% level for one degree of freedom;
This is 3.84 (Table 2 overleaf)

Step 8: Make a decision - if your chi-squared value is bigger than the one from the tables, you can reject the null hypothesis. Otherwise you have to accept it.
Our value is smaller than the value from the tables, so we accept the null hypothesis - the results are not significantly different from the 3:1 ratio;

Total 15
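The ratio-based calculation in part c) ii) can be sketched in Python as a check. The helper name `expected_from_ratio` and the variable names are illustrative assumptions; the critical value 3.84 is the Factsheet's table value for 1 degree of freedom at the 5% level.

```python
def expected_from_ratio(total, ratio):
    """Expected count per category = total x ratio number / sum of the ratio numbers."""
    return [total * part / sum(ratio) for part in ratio]


observed = [17, 3]                       # red-flowered, yellow-flowered
expected = expected_from_ratio(sum(observed), [3, 1])   # predicted 3:1 ratio

# Chi-squared value: sum of (O - E)^2 / E over the two categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                   # one less than the number of categories
critical = 3.84                          # Factsheet table: 1 df, 5% level

print(expected)          # [15.0, 5.0]
print(round(chi2, 4))    # about 1.0667
print("reject H0" if chi2 > critical else "do not reject H0")
```

Summing the expected list gives back the 20 plants observed, which is the check suggested in the Exam Hint.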
Acknowledgments: This Factsheet was researched and written by Cath Brown. Curriculum Press, Unit 305B The Big Peg, 120 Vyse Street, Birmingham B18 6NF
Bio Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber. No part of these Factsheets may
be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 13515136