Scientific Method - So, like, how DO we do Science?
IB Biology HL
The Scientific Method
• Has several stages:
– Observation of a puzzling phenomenon
– Making a guess as to the explanation
– Devising a way to test whether or not the
explanation is accurate
– Performing the test(s) and using the results to
determine whether the explanation is accurate or not
The Dog That Understands French
• Mr Smith has taught his dog Rover to understand
French. Mr Smith noticed that every evening,
after dinner, when he went to the door with his
coat on and said "Walk", Rover immediately
understood and came running. Mr Smith was
going to France for the summer, and, as an
experiment in international understanding,
decided to teach Rover French. He started to say
"Allons" instead of "Walk". To his delight, Rover
very quickly understood and came running.
What is the observation?
The dog apparently responds to the call of ‘allons’
What is Mr Smith’s hypothesis?
The dog understands the word as meaning ‘walk’
Is this the only explanation?
No. The dog may respond to a total situation (after
dinner, going to the door, coat on, call) of which what is
actually called is only a small part. A change in the call
may not matter much to the dog.
• Can we devise a test to discriminate between possible
explanations?
Possible Tests
1) Call "Allons" to the dog in a different
situation: for example, in the morning when
he does not usually go for a walk.
2) Go to the door, in the normal way (coat on,
after dinner) without calling anything.
3) Do likewise and call something silly like
• Notice that these tests do not tell us anything
about a dog's ability to learn French words. They
are only concerned with the specific case of
responding to one French word. We will see
later that extrapolating from the specific to
the general is very important in scientific work.
Experimental Design
• Experiments should discriminate between
different hypotheses
• Usually experiments need to be repeated several
times for the results to be analyzed statistically
• Scientists must be careful in generalizing results
to others or a population
• Experiments must also be well controlled
• Scientists must take into account the accuracy
and precision of their measurement instruments
Let’s Practice
• Experiment: Do plants give off water vapour?
Forty bean plants, growing in pots, were covered one
afternoon by individual glass containers and left in
the laboratory overnight. Next morning, the inside of
the lid of each container was found to be covered in
droplets of a fluid that proved to be water.
Conclusion: Plants generally give off water vapour.
1. Lack of Controls
The water could have come from the plants, the soil, the pots, or the air in
the jar. Control experiments should have been set up to test for these
possibilities.
2. The conclusion contains some points that are not valid
(a) The experiment was done overnight and so can tell us
nothing about the behaviour of the plants at other times
of day; the word 'generally' is not justified.
(b) It was carried out with an adequate number of bean
plants but can tell us nothing about other kinds of plants;
the word 'plants' should be qualified.
(c) There is no evidence in the experiment that water is
given off as a vapour.
Designing Experiments to Use Statistics
• Good experimental design is the key to good
• In many cases, good experimental design
involves having a clear idea about how we will
analyze the results when we get them. That's
why statisticians often tell us to think about
the statistical tests we will use before we start
an experiment.
Three Important Steps in Experimental Design
1. Define the objectives. Record (i.e. write down)
precisely what you want to test in an experiment.
2. Devise a strategy. Record precisely how you can
achieve the objective. This includes thinking about
the size and structure of the experiment - how many
treatments? how many replicates? how will the
results be analyzed?
3. Set down all the operational details. How will the
experiment be performed in practice? In what order
will things be done? Should the treatments be
randomized or follow a set structure? Can the
experiment be done in a day?
Common Statistical Terms
• Suppose that we are measuring the size of cells. The thing that we
are measuring or recording (e.g. cell size) is called a variable.
• Each measurement that we record (e.g. the size of each cell) is a
value or an observation.
• We obtain a number of values (e.g. 100 for cells), and this is our
sample.
• The sample (e.g. 100 cells) is part of a population. In this case the
population (in biological terms) is all the cells in the culture.
Theoretically, we could measure every cell to get a precise measure
of that population. But often we want to be able to say more than
this - something of general significance, based on our sample. For
example, that if anyone were to measure the cells of that organism,
then they would find a certain average value and a certain range of
variation.
Common Statistical Measures
• The average, or mean
• Some measure of the dispersion (range of
variation) of data around the sample mean.
For this we use the variance and thence the
standard deviation
• Having obtained those values, we use them to
estimate the population mean and the
population variance.
A Sample
• In the following sections we will start from a
small sample, describe it in statistical terms,
and then use it to derive estimates of a
population.
• Here are some values of a variable: 120, 135, 160, 150.
• We will assume that they are measurements of the
diameter of 4 cells, but they could be the mass of 4
cultures, the lethal dose of a drug in 4 experiments
with different batches of experimental animals, the
heights of 4 plants, or anything else. Each value is a
replicate - a repeat of a measurement of the variable.
• In statistical terms, these data represent our sample.
We want to summarize these data in the most
meaningful way. So, we need to state:
– the mean
– the number of measurements (n) that it was based on
– and a measure of the variability of the data about the
mean (which we express as the standard deviation)
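As a worked sketch of this summary for the four values above (120, 135, 160, 150), the three quantities can be computed with Python's standard `statistics` module. The variable names are illustrative and not part of the original slides.

```python
import statistics

# The four replicate measurements from the slide (e.g. cell diameters).
sample = [120, 135, 160, 150]

n = len(sample)                  # number of measurements
mean = statistics.mean(sample)   # sum of the values divided by n
sd = statistics.stdev(sample)    # sample standard deviation (divides by n - 1)

print(f"n = {n}, mean = {mean:.2f}, SD = {sd:.2f}")
```

For these values the mean is 141.25 and the standard deviation is 17.5, so the sample could be reported as 141.25 ± 17.5 (n = 4).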
Why divide by n-1?
• Dividing the sum of squared deviations by n gives,
on average, an underestimate of the true underlying
population variance, because the deviations are
measured from the sample mean rather than the
(unknown) population mean. The effect is largest in
small samples such as our sample of 4.
• In general, this biased value should not be used.
• We divide by one less than the number of values
to make the variance, and hence the standard
deviation, slightly bigger; in doing so, we take
into account the fact that there could be values in
the population that lie outside our sample range,
and hence our estimate of the population variance
is more accurate.
• n-1 is the number of degrees of freedom.
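One way to see the effect of the divisor is a small simulation. The sketch below is not from the slides: it assumes a normally distributed population with a mean of 100 and a standard deviation of 10 (both made up for illustration), draws many samples of 4, and averages the two variance estimates.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

TRUE_SD = 10.0            # assumed population standard deviation (illustrative)
TRUE_VAR = TRUE_SD ** 2   # so the true population variance is 100
N = 4                     # sample size, as in the slide example
TRIALS = 20000            # number of simulated samples

sum_var_n = 0.0           # running total of variances that divide by n
sum_var_n_minus_1 = 0.0   # running total of variances that divide by n - 1

for _ in range(TRIALS):
    sample = [random.gauss(100.0, TRUE_SD) for _ in range(N)]
    sample_mean = sum(sample) / N
    sum_sq = sum((x - sample_mean) ** 2 for x in sample)
    sum_var_n += sum_sq / N
    sum_var_n_minus_1 += sum_sq / (N - 1)

avg_var_n = sum_var_n / TRIALS
avg_var_n_minus_1 = sum_var_n_minus_1 / TRIALS

print(f"true variance:             {TRUE_VAR:.1f}")
print(f"average, dividing by n:    {avg_var_n:.1f}")
print(f"average, dividing by n-1:  {avg_var_n_minus_1:.1f}")
```

With samples of 4, dividing by n should come out at roughly three quarters of the true variance on average, while dividing by n-1 should come out close to the true value.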
Standard Deviation vs. Standard Error
• We calculate the sample mean so that we can use it as
an estimate of the mean for the whole population.
• The sample mean will vary from sample to sample; the
way this variation occurs is described by the "sampling
distribution" of the mean.
• We can estimate how much sample means will vary
using the standard deviation of this sampling
distribution, which we call the standard error (SE) of
the estimate of the mean. For a sample of n values, it
is estimated as the sample standard deviation divided
by the square root of n. Another way of considering
the standard error is as a measure of the precision of
the sample mean.
Standard Error
• The standard error (SE) reflects both the dispersion
associated with the underlying population and
the error associated with the sampling process.
• We can think of the standard error as measuring
how precisely we have estimated the population
mean, or another parameter, via the sample
mean, or another statistic.
• As the sample size gets bigger and bigger, the
standard error will shrink, reflecting the fact that
our estimate for the mean, or another statistic,
will become more and more precise.
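A minimal sketch of the calculation, assuming the usual estimate of the standard error of the mean (sample standard deviation divided by the square root of n) and reusing the four cell measurements from earlier. The function name `standard_error` is illustrative, and the larger sample sizes in the loop are hypothetical.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

sample = [120, 135, 160, 150]
sd = statistics.stdev(sample)
se = standard_error(sample)
print(f"SD = {sd:.2f}, SE = {se:.2f}")

# Hypothetical: if the SD stayed the same but we measured more cells,
# the standard error would shrink with the square root of n.
for n in (4, 16, 64, 256):
    print(f"n = {n:3d}  SE = {sd / math.sqrt(n):.2f}")
```

For the four measurements the standard deviation is 17.5 and the standard error is 8.75. Note that quadrupling the sample size only halves the standard error.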
So which do we use?
• If we want to say how widely scattered some
measurements are, we use the standard
deviation. If we want to indicate the
uncertainty around the estimate of the mean
measurement, we quote the standard error of
the mean.
Confidence Intervals
• The standard error is most useful as a means
of calculating a confidence interval. For a large
sample, a 95% confidence interval is obtained
as the values 1.96 × SE either side of the mean.
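The arithmetic can be sketched as follows, again using the four cell measurements from earlier. Four values are not a large sample, so the 1.96 multiplier is used here only to show how the interval is built. The function name is illustrative.

```python
import math
import statistics

def confidence_interval_95(values):
    """Large-sample 95% confidence interval: mean +/- 1.96 * SE."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 1.96 * se
    return mean - half_width, mean + half_width

sample = [120, 135, 160, 150]
low, high = confidence_interval_95(sample)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Here the mean is 141.25 and the standard error is 8.75, so the interval runs from about 124.1 to about 158.4.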
Let’s practice
• Copy down the following two data sets:
a) 46, 42, 44, 45, 43
b) 52, 80, 22, 30, 36
• Using the graphing calculator, calculate the
sample mean and standard deviation for each
• What do you notice?
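For anyone checking their calculator results, the same calculation can be sketched in Python with the standard `statistics` module. The variable names are illustrative.

```python
import statistics

data_a = [46, 42, 44, 45, 43]
data_b = [52, 80, 22, 30, 36]

mean_a = statistics.mean(data_a)
mean_b = statistics.mean(data_b)
sd_a = statistics.stdev(data_a)   # sample SD (divides by n - 1)
sd_b = statistics.stdev(data_b)

print(f"a) mean = {mean_a}, SD = {sd_a:.2f}")
print(f"b) mean = {mean_b}, SD = {sd_b:.2f}")
```

Both data sets have the same mean (44), but their standard deviations are very different (about 1.6 for a and about 22.9 for b), so the mean alone does not describe a sample.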
Another Example
• Assume that
* the first data set ("A") (46,42,44,45,43) represents
measurements of five animals that have been given a
particular treatment and
* the second data set ("B") (52,80,22,30,36) measurements
of five other animals given a different treatment.
* A third set ("C") of five animals was used as controls;
they were given no treatment at all and their
measurements were 20, 23, 24, 19, 24. The mean of the
control group is 22 and the standard error is about 1.0
(sample standard deviation about 2.3).
• Did treatment A have a significant effect? Did treatment B?
Null Hypothesis
• In principle, a scientist designs an experiment to test
whether an observed effect could be due to chance alone.
The assumption that it is due to chance alone is called the
null hypothesis; concluding that it is not is called
rejecting the null hypothesis.
• The value p is the probability of obtaining a difference
between the experimental group and the controls at least as
large as the one observed if the null hypothesis were
correct. If p is greater than 0.05, the difference is
usually not considered significant. If p < 0.05, the
difference is considered significant, and the null
hypothesis is rejected.
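The slides do not name a particular test, so the sketch below makes an assumption: it compares each treatment group with the controls using a two-sample t statistic with pooled variance. For 5 + 5 animals there are 8 degrees of freedom, and the two-tailed critical value of t at the 0.05 level is 2.306. The function and variable names are illustrative.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Two-sample t statistic, assuming equal variances in both groups."""
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1)   # sample variances (divide by n - 1)
    v2 = statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / se_diff

treatment_a = [46, 42, 44, 45, 43]
treatment_b = [52, 80, 22, 30, 36]
control_c = [20, 23, 24, 19, 24]

T_CRITICAL = 2.306  # two-tailed, p = 0.05, 8 degrees of freedom

t_a = pooled_t(treatment_a, control_c)
t_b = pooled_t(treatment_b, control_c)

for name, t in (("A", t_a), ("B", t_b)):
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"Treatment {name} vs control: t = {t:.2f} ({verdict} at p = 0.05)")
```

Under these assumptions, treatment A gives t of about 17.4 and the null hypothesis is rejected, while treatment B gives t of about 2.13 and it is not, even though both treatment groups have the same mean of 44. The large scatter in group B makes its mean too uncertain to distinguish from the controls, and it also makes the equal-variance assumption doubtful for that comparison.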