BIL 151: Enzymes & Enzymatic Reactions Data Analysis

Once your team has collected an adequate sample size of raw data, you will be ready to do
some calculations. This lab chapter provides a brief guide.
I. Data Analysis
Since you are comparing rates of reaction between two experimental groups, it is important
to understand how to properly calculate and present reaction rates. You will then be able to
perform a statistical test to see whether the average reaction rates of your two experimental
groups are significantly different. Once your calculations are finished and your results are
clear, your team must meet to discuss how to explain your observations, logically and
completely, in your presentation.
A. Calculating the Rate of Reaction
Biological functions may take any number of shapes. In the case of the reaction rate of
catalase breakdown of hydrogen peroxide, the function will form a linear relationship while the
reaction is in progress. To analyze your results, you will calculate the slope of the linear
functions. A slope, with units of y over x, is an expression of rate. A rate is an expression of
change over an independent variable such as time. "Miles per hour," "millimeters per second,"
and "pizzas per semester" are all expressions of rate.
When you measured the volume of oxygen gas generated during the breakdown of
hydrogen peroxide by catalase, you obtained raw data (i.e., data straight from a measuring
apparatus, which have not undergone any type of mathematical transformation) similar to that
shown in Table 1.
Table 1. The cumulative volume of oxygen generated by the breakdown of
hydrogen peroxide by catalase under control conditions (pH 7.0, 25°C)

time (seconds)    cumulative O2 volume (cc)
      0                    0
      5                    5
     10                   12
     15                   20
     20                   37
     25                   44
     30                   50
     35                   56
     40                   64
     45                   72
     50                   75
     55                   78
     60                   82

Each of the data points consists of an independent (time) and a dependent (cc O2)
variable. These are specific coordinates, corresponding to x and y on a graph. The
coordinates of the data in Table 1, from top to bottom, are (0,0), (5,5), (10,12), (15,20), (20,37),
(25,44), (30,50), (35,56), (40,64), (45,72), (50,75), (55,78) and (60,82). These data
(remember: data is the plural of datum) are plotted in Figure 1.
B. Slope of the Line is Equal to the Reaction Rate
The horizontal axis of a graph is known as the abscissa or x-axis. It is labeled with the
units of the independent variable. The independent variable is so named because although it
changes over the course of the experiment, it is not affected by changes in the experiment.
Time is a commonly used independent variable. Its units may be seconds, minutes, hours,
months, years, etc.
The vertical axis is known as the ordinate or y-axis. It is labeled with the units of the
dependent variable, which changes depending upon the progression of the independent
variable. An example of a dependent variable is the change in oxygen volume generated by a
chemical reaction over time (independent variable).
Notice in Figure 1 that the straightest part of the function does not pass through (0,0).
Evidently, this particular experimental run started slowly, then increased to a more consistent
rate. The dotted line in the figure shows a somewhat "J" shaped relationship at the beginning
of the experiment. The best fit line in Figure 1 does not necessarily pass through every data
point. Rather, it should reflect the rate of the reaction at its optimum.
To calculate the slope of a line (which corresponds to rate of reaction), determine the
change in y (Δy) and divide it by the corresponding change in x (Δx). Because y results in a
vertical change and x results in a horizontal change, you may recall that the calculation of
slope is sometimes referred to as the calculation of "rise" over "run." Your rate will be
expressed as the units of y over the units of x.
$$\text{rate} = \text{slope} = \frac{\Delta y}{\Delta x}$$
Figure 1. Cumulative volume of oxygen generated by the breakdown of
hydrogen peroxide by catalase under control conditions (pH 7.0, 25°C).
In our example, the volume of O2 generated is plotted against time. Thus, the units
of the rate are expressed in cc O2 (y axis) per second (x axis), or more simply, cc O2/sec.
Because the slope of a straight line is the same no matter where it is measured, choose
any two corresponding values of x and y.
Once you have plotted your data points, study their relationship. Do they form a straight
line? An "S" curve? A "J" curve? A parabola? Fortunately, you wonʼt have to do any
guesswork, as the Vernier software you used to collect your data will also calculate the rate of
your reaction. But if you were to determine the function by hand and eye, it would be important
to note that itʼs not as simple as "connect the dots." Notice whether the reaction started slowly,
picked up speed, and then leveled off. If this is the case, then your rate calculation will be less
accurate if you include the more horizontal “start up” and “taper off” portions of such an "S"
shaped curve. To calculate the rate of a reaction that is linear, use the points of the function
that best approximate a straight line, when the reaction is proceeding at its maximum rate.
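One way to see where the "start up" and "taper off" portions lie is to compute the rate over each 5-second interval separately. The sketch below does this for the Table 1 data; the function name `interval_rates` is made up for illustration and is not part of the Vernier software.

```python
# Table 1 data: time in seconds, cumulative O2 volume in cc.
times = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
volumes = [0, 5, 12, 20, 37, 44, 50, 56, 64, 72, 75, 78, 82]

def interval_rates(x, y):
    """Return (start, end, rate) for each pair of consecutive readings."""
    result = []
    for i in range(1, len(x)):
        rate = (y[i] - y[i - 1]) / (x[i] - x[i - 1])  # delta y / delta x
        result.append((x[i - 1], x[i], rate))
    return result

rates = interval_rates(times, volumes)
for start, end, rate in rates:
    print(f"{start:>2}-{end:<2} s: {rate:.1f} cc O2/sec")
```

In these data the last three intervals run at 0.6 to 0.8 cc O2/sec, noticeably slower than the 1.2 to 1.6 cc O2/sec seen between 20 and 45 seconds, which suggests the reaction was tapering off by the 45-second mark.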
1. A best fit line through the data points has already been drawn. Notice that the line passes
near (but not necessarily through) the points that appear to be most linear with respect to
each other. Although the first data point should occur at (0,0), the best fit line does not
pass through it, apparently because the reaction did not begin immediately at its most
consistent rate (i.e., it took a moment to really get going).
2. Choose any two points along the line and determine their coordinates. For example,
coordinates (35, 56) and (50, 75).
3. Subtract the smaller y value from the larger y value. This quantity is Δy, or "rise." In our
example, Δy = 75 - 56 = 19.
4. Next, subtract the smaller x value from the larger x value. This quantity is Δx, or "run." In
our example, Δx = 50 - 35 = 15.
5. Divide rise by run (Δy/Δx), being certain to include the units of each variable. The result is
the slope of the line, which is equal to the rate of the reaction.
6. In our example, slope = 19 cc O2/15 seconds ≈ 1.3 cc O2/sec.
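The steps above can be sketched in a few lines of Python. The first function reproduces the two-point "rise over run" calculation. The second fits an ordinary least-squares line, which is one common way to draw a best fit line; this chapter does not say which method the Vernier software uses, so treat the least-squares fit and the choice of the 20 to 45 second readings as assumptions made for illustration.

```python
def two_point_slope(p1, p2):
    """Slope (rise over run) between two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

def least_squares_slope(points):
    """Slope of the ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# The two points used in the worked example.
rate_two_point = two_point_slope((35, 56), (50, 75))

# Assumption: the 20-45 s readings from Table 1 are treated as the linear portion.
linear_portion = [(20, 37), (25, 44), (30, 50), (35, 56), (40, 64), (45, 72)]
rate_fit = least_squares_slope(linear_portion)

print(f"two-point rate:     {rate_two_point:.2f} cc O2/sec")
print(f"least-squares rate: {rate_fit:.2f} cc O2/sec")
```

The two estimates differ slightly because the two-point method depends on which points you happen to pick, whereas the fitted line uses every reading in the chosen window.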
Your team no doubt ran several experimental trials for each of your variables. You should
calculate a rate for each experimental trial in each of your groups (e.g., treatment and control).
These rates can then be analyzed with a Student's t-test, which will tell you whether there is a
significant difference between the mean rates of reaction of your experimental groups.
C. Statistical Testing
Probability calculations form the basis of one of the scientist's most important tools: the
statistical test. Once data have been collected, it's not enough to merely "eyeball" them and
say, “Eeeyup. This is different from what we expected! Something weird is going on here!"
Investigators use statistics generated from their data sets to determine the likelihood that
their results differ sufficiently from the expected results to conclude they are unlikely to have
arisen as a matter of chance. Over the decades, many different probability distributions
have been devised by mathematicians, each one appropriate for different types of data.
Enough statistical tests and their associated probability distributions have been invented to
fill many textbooks. Some of these, such as the Chi-square test, the Student t-test, the
Analysis of Variance (ANOVA), the Mann-Whitney U test and the Fisher's exact test may
sound familiar to you. The specific probability distribution and statistical test appropriate in a
given situation depend upon the type of data collected and the nature of your hypothesis.
One oft-utilized probability distribution is Student's t-distribution, used to determine
whether the observed difference between the (continuous numerical) means of two samples
is unlikely to have arisen if they were in fact drawn from the same population (or from
populations with identical means). To make a very long and complex story short, an
investigator can use the mean, variance, and standard deviation of his/her data sets to
calculate a t-statistic. Every possible value of the t-statistic is linked to a certain probability
that a difference in sample means at least as large as the one observed would arise simply
as a matter of chance.
1. Calculation of mean, variance and standard deviation
Your reaction rates are a form of continuous numerical data. To analyze them
correctly, you will need to determine several important quantities:
x  = data point            the individual values of a measured parameter (= xi)
x̄  = mean                  the average value of a measured parameter
n  = sample size           the number of individuals in a particular test group
df = degrees of freedom    the number of independent quantities in a system
s² = variance              a measure of individual data points' variability from the mean
s  = standard deviation    the positive square root of the variance
To calculate the mean rate of reaction of either the treatment or control group, sum the
rates of all individual trials in a particular group and divide the sum by the number of trials.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
When you are studying some measurable aspect of a sample of a population (such as the
index : ring finger ratio), it is important to understand how much variation around the mean
your sample exhibits. In biological systems in particular, there is almost always a great deal of
variation around the mean. Variability is part of nature and there is nothing "wrong" with it. In
many biological studies, estimating the variance is as important as, if not more important than,
estimating the mean. Natural variability is part of life, and it is very important to measure
and understand how biological systems vary.
Measurements of dispersion around the mean include the range, variance and standard
deviation. The simplest of these is the range, which is defined as the highest value minus the
lowest value. Unfortunately, the range tends to increase with sample size, and because it
employs essentially only the two extreme values, a great deal of information about variation
between those extremes is lost.
More useful are the variance and standard deviation, which are measures of deviations
from the mean.
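The variance (s²) is calculated as
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
(If you're not sure what the symbols mean, go back and review the formula for the mean.)
The standard deviation (s) is the square root of the variance, and is calculated as
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$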
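As a quick illustration, the sketch below computes the mean, range, variance and standard deviation for one group of trials. The five rates are invented for the example (they are not real lab data), and the variance uses the usual sample form that divides by n - 1, the degrees of freedom. The last lines cross-check the hand calculation against Python's built-in `statistics` module.

```python
import math
import statistics

# Hypothetical reaction rates (cc O2/sec) from five trials of one group.
trial_rates = [1.3, 1.1, 1.4, 1.2, 1.5]

n = len(trial_rates)
mean = sum(trial_rates) / n
data_range = max(trial_rates) - min(trial_rates)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in trial_rates) / (n - 1)
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.3f}, range = {data_range:.3f}")
print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")

# Cross-check against the standard library (both use n - 1).
print(statistics.variance(trial_rates), statistics.stdev(trial_rates))
```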
2. The Student’s t-test
The Student’s t-test can be used to determine whether a difference between two means is
significant. Note that “significant” in this sense is NOT the same as “biologically meaningful.”
It refers only to whether the observed difference is unlikely to be due to chance (“statistically
significant”). These means may be calculated from observations that are either paired (as
when individuals in a single group are subjected to "before and after" measurements, and data
points are paired for each tested individual) or independent (as when individuals in two similar
sample populations are measured, but each individual in each sample population is measured
only once). Slightly different calculations of the t-statistic must be used in each case.
If you take "before" and "after" measurements from the same experimental system, the two
values obtained would not be independent of one another. Rather, they would be paired.
Statistically, paired data must be analyzed differently than independent samples.
In a paired sample t-test, the separate means of two different sample populations are not
compared. Instead, the difference between the first measurement and second measurement
of the same individual is calculated and used to generate a t-statistic. In the paired sample
t-test, the underlying hypothesis is that the mean of these paired differences is zero.
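This chapter does not give a formula for the paired case, so the sketch below uses the standard textbook form: the mean of the paired differences divided by its standard error (the standard deviation of the differences over the square root of n), with n - 1 degrees of freedom. The "before" and "after" values and the function name `paired_t` are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t-statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for b, a in zip(before, after)]  # one difference per individual
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)               # sample SD (divides by n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical before/after measurements on the same five individuals.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]

t_paired, df_paired = paired_t(before, after)
print(f"t = {t_paired:.3f}, df = {df_paired}")
```

Note that only the differences enter the calculation. The spread among individuals in the raw "before" values never appears, which is why pairing removes variation between individuals.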
In your experiment, you ran each trial with new reagents in your two different systems (e.g.,
treatment and control). Thus, the means of your two sample populations (e.g., treatment and
control) are not paired. They are independent because a single rate is calculated for each
unique trial run. An independent sample t-test is appropriate for analysis of this type of data.
In the independent sample t-test, the underlying hypothesis is that the difference between the
two sample means is zero. (This is a subtle, but crucial, difference from the underlying
hypothesis of the paired sample t-test.)
Paired designs are generally preferable when they are feasible, because they remove the
variation between samples that arises solely from their containing different individuals.
However, in some experiments (including the ones you performed with yeast and
hydrogen peroxide) it is simply not possible to use a paired design because the individual is
permanently changed in the process of the experiment. To analyze your data, you will use an
independent sample t-test.
The critical values for the t-distribution are the same for either paired or independent
samples, and the table of critical values (Table 2) can be used for either one to determine the
P value associated with your t-statistic at the degrees of freedom in your system.
Use the independent sample t-test to calculate a t-statistic for your two means:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
...in which x̄1 and x̄2 are the means of your two groups, n1 and n2 are the
numbers of trials you ran in each group, and sp² is the pooled variance.
Pooled variance is calculated as:
$$s_p^2 = \frac{df_1 \, s_1^2 + df_2 \, s_2^2}{df_1 + df_2}$$
...in which s1² is the variance of group 1, s2² is the variance of group 2, df1 is the
degrees of freedom for group 1 (df1 = n1 - 1) and df2 is the degrees of freedom
for group 2 (df2 = n2 - 1).
The degrees of freedom for a two-sample t-test with independent means is calculated as
the sum of the degrees of freedom of each test group:
df = (n1 - 1) + (n2 - 1)
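The calculation can be sketched as follows, using the standard pooled-variance form of the independent sample t-test (each group's variance weighted by its degrees of freedom). The control and treatment rates are invented for illustration, and `independent_t` is a hypothetical helper name, not a function from any lab software.

```python
import math
import statistics

def independent_t(group1, group2):
    """Return (t-statistic, degrees of freedom) using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    df1, df2 = n1 - 1, n2 - 1
    s1_sq = statistics.variance(group1)  # sample variance of group 1
    s2_sq = statistics.variance(group2)  # sample variance of group 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled = (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    t = mean_diff / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df1 + df2

# Hypothetical reaction rates (cc O2/sec), five trials per group.
control = [1.3, 1.1, 1.4, 1.2, 1.5]
treatment = [0.8, 0.9, 0.7, 1.0, 0.6]

t_stat, df = independent_t(control, treatment)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sign of t only reflects which group was listed first; when you look up a two-tailed P value, use the absolute value of t.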
What is your t-statistic?
What are your degrees of freedom?
What is the P value associated with your t-statistic and degrees of freedom? (Use the table of
critical values for the t-test from Lab #1.)
_____ > P > _____
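Reading a printed table gives you a bracket for P rather than an exact value. The sketch below shows that lookup for a single row, using the two-tailed critical values listed for df = 8 in the table of critical values at the end of this chapter. The function name `bracket_p` and the example t-statistic of 2.5 are assumptions made for illustration.

```python
# Two-tailed P levels and the matching critical values for df = 8.
P_LEVELS = [0.10, 0.05, 0.02, 0.01, 0.001]
CRITICAL_DF8 = [1.860, 2.306, 2.896, 3.355, 5.041]

def bracket_p(t_stat, critical_values, p_levels=P_LEVELS):
    """Return (upper, lower) so that upper > P > lower for a two-tailed test."""
    t_abs = abs(t_stat)
    upper, lower = 1.0, 0.0
    for crit, p in zip(critical_values, p_levels):
        if t_abs >= crit:
            upper = p   # t reaches this critical value, so P is at most p
        else:
            lower = p   # t falls short of this one, so P is greater than p
            break
    return upper, lower

upper, lower = bracket_p(2.5, CRITICAL_DF8)
print(f"{upper} > P > {lower}")
```

A t-statistic smaller than the first critical value in the row yields a bracket with a lower bound of 0.10, meaning the result does not reach even the most lenient P level in the table.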
D. Drawing Conclusions
When all teams have analyzed their data and drawn conclusions about their results, your
lab instructor will lead a class discussion. Answer the following carefully.
Does your P value indicate that your two means are sufficiently different from one another that
the difference is probably not due to chance?
Do you accept or reject your null hypothesis?
Briefly, what is your groupʼs conclusion about your experiment and original question you posed
about your experimental system?
II. Data Presentation
In a scientific presentation, a grid containing values corresponding to data is known as a
table. A photograph, line drawing or graph is known as a figure. Either type of graphic must
be properly labeled (as a Table or a Figure) and given an informative legend. Note that the legend of a
table is placed at the top of the graphic, whereas the legend of a figure is placed below.
In our example above, Table 1 and Figure 1 say exactly the same thing. To include both
representations of the same data in your presentation would be redundant. Choose only one.
In this case, the figure provides more information, and is the better choice.
If you ran replicate experiments in which you varied a single factor to determine whether
that factor affected reaction rate, you must calculate reaction rate (slope) for each experimental
trial. To graphically represent the difference between your two experimental groups, you might
wish to create a figure in which you plot reaction rate as the dependent variable (y axis)
against the independent variable (e.g., temperature, pH, chemical concentration) for each
group (e.g., treatment and control). The resulting figures should show a relationship between
reaction rate and the variable.
If you are comparing reaction rate in a treatment and control system, then your figure will
be more informative if both curves (treatment and control) are shown on the same graph, for
comparison. Be sure to differentiate them clearly, using different symbols, colors, or whatever
distinction your team thinks is effective.
1. The purpose of a figure is to allow your readers to more easily comprehend your data.
Label your axes clearly with the appropriate units of measure. Figures in a PowerPoint
presentation or a poster should be large, clearly labeled, and central to the presentation.
2. Each figure must be numbered and be accompanied by a descriptive legend placed
underneath the figure. In a scientific paper, all figures must be referred to in the text of the
paper. In a PowerPoint or Poster presentation, the figures should stand alone.
More information about the nature of your research symposium presentation can be found in
the next chapter of your online lab manual.
Table 2-2. Table of critical values for the two-sample t-test. The P levels (0.05) indicating rejection of
the null hypothesis are shown in bold for both one-tailed and two-tailed hypotheses. (From Pearson
and Hartley in Statistics in Medicine by T. Colton, 1974. Little, Brown and Co., Inc. publishers.)
2-tail -->     0.10      0.05      0.02      0.01     0.001
1-tail -->     0.05     0.025      0.01     0.005    0.0005
df
 1            6.314    12.706    31.821    63.657   636.619
 2            2.920     4.303     6.965     9.925    31.598
 3            2.353     3.182     4.541     5.841    12.941
 4            2.132     2.776     3.747     4.604     8.610
 5            2.015     2.571     3.365     4.032     6.859
 6            1.943     2.447     3.143     3.707     5.959
 7            1.895     2.365     2.998     3.499     5.405
 8            1.860     2.306     2.896     3.355     5.041
 9            1.833     2.262     2.821     3.250     4.781
10            1.812     2.228     2.764     3.169     4.587
11            1.796     2.201     2.718     3.106     4.437
12            1.782     2.179     2.681     3.055     4.318
13            1.771     2.160     2.650     3.012     4.221
14            1.761     2.145     2.624     2.977     4.140
15            1.753     2.131     2.602     2.947     4.073
16            1.746     2.120     2.583     2.921     4.015
17            1.740     2.110     2.567     2.898     3.965
18            1.734     2.101     2.552     2.878     3.922
19            1.729     2.093     2.539     2.861     3.883
20            1.725     2.086     2.528     2.845     3.850
21            1.721     2.080     2.518     2.831     3.819
22            1.717     2.074     2.508     2.819     3.792
23            1.714     2.069     2.500     2.807     3.767
24            1.711     2.064     2.492     2.797     3.745
25            1.708     2.060     2.485     2.787     3.725
26            1.706     2.056     2.479     2.779     3.707
27            1.703     2.052     2.473     2.771     3.690
28            1.701     2.048     2.467     2.763     3.674
29            1.699     2.045     2.462     2.756     3.659
30            1.697     2.042     2.457     2.750     3.646