AP Lab Skills Guide

Data will fall into three categories:
1. Parametric (normal) data
- Normal distribution around mean
- Mean and SD can predict future observations
Ex. Heart rate, plant height, body temp
2. Nonparametric data
- Does not fit normal distribution
- May include large “outliers”
3. Frequency or count data
- Counting how many of an item fit into a category
Ex) Doing a genetic cross (Aa x Aa) and counting how many offspring are AA, Aa and aa.
Ex2) Data collected as percentages, like the percentage of cells in interphase of a root tip…you are just counting.
Let’s make up a problem, build a histogram, and calculate the SD (standard deviation) and SEM (standard error of the mean).
Let’s say I measured the heart rate of 10 people:
Heart rate (bpm)
76
82
90
73
58
65
63
74
71
68
Mean = 72
(Mean - data point)^2
(72 - 76)^2 = 16
(72 - 82)^2 = 100
(72 - 90)^2 = 324
(72 - 73)^2 = 1
(72 - 58)^2 = 196
(72 - 65)^2 = 49
(72 - 63)^2 = 81
(72 - 74)^2 = 4
(72 - 71)^2 = 1
(72 - 68)^2 = 16
Add them up: 788
Now divide by the sample size minus one: 788/(9) = 87.6
Lastly, take the sqrt: Sqrt 87.6 = 9.4 (this is the SD)
72 +/- 9.4 bpm
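The hand calculation above can be double-checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the cross-check assumes the standard library's `statistics.stdev`, which uses the same "sample size minus one" divisor.

```python
import math
import statistics

heart_rates = [76, 82, 90, 73, 58, 65, 63, 74, 71, 68]  # bpm, from the example

n = len(heart_rates)
mean = sum(heart_rates) / n

# Squared difference between the mean and each data point
squared_diffs = [(mean - x) ** 2 for x in heart_rates]
sum_of_squares = sum(squared_diffs)

# Divide by (sample size - 1), then take the square root
variance = sum_of_squares / (n - 1)
sd = math.sqrt(variance)

print(f"mean = {mean:.1f} bpm")
print(f"sum of squared differences = {sum_of_squares:.0f}")
print(f"SD = {sd:.2f} bpm")

# Cross-check against the standard library's sample standard deviation
assert math.isclose(sd, statistics.stdev(heart_rates))
```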
[Figure: Histogram of Resting Heart Rates. X-axis: heart rate bins (bpm), in 5-bpm bins from 51 to 55 through 91 to 95. Y-axis: number of individuals, 0 to 3.5.]
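The binning behind a histogram like this one can be sketched in Python. The 5-bpm bins starting at 51 are taken from the figure's axis labels; the helper name `bin_label` and the text-only output are assumptions made for illustration.

```python
from collections import Counter

heart_rates = [76, 82, 90, 73, 58, 65, 63, 74, 71, 68]  # bpm

BIN_WIDTH = 5
FIRST_BIN_START = 51  # bins run 51-55, 56-60, ..., 91-95


def bin_label(value):
    """Return the 'low to high' label of the 5-bpm bin containing value."""
    low = FIRST_BIN_START + ((value - FIRST_BIN_START) // BIN_WIDTH) * BIN_WIDTH
    return f"{low} to {low + BIN_WIDTH - 1}"


# Count how many individuals fall into each bin
counts = Counter(bin_label(x) for x in heart_rates)

# Print a text histogram, including the empty bins
for low in range(FIRST_BIN_START, 96, BIN_WIDTH):
    label = f"{low} to {low + BIN_WIDTH - 1}"
    print(f"{label:>9} | {'#' * counts[label]}")
```

A roughly bell-shaped pile of marks around the middle bins is what you would expect from parametric data.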
What does the SD tell you?
SD describes the predicted spread or variation of the measured variable in the ENTIRE population…
Remember, you only measured a small sampling…what if you measured everyone?
Ex) If the mean is 50 and the standard deviation is 10 (50 +/- 10), then for normally distributed data 68% of the population is predicted to be between 40 (50-10) and 60 (50+10).
And…
95% of the population is predicted to be within 2 standard deviations of the mean, between 30 (50 minus 20) and 70 (50 plus 20).
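The 68% and 95% ranges in the example above are just the mean plus or minus one or two SDs. A minimal sketch, assuming normally distributed data (the function name `sd_ranges` is made up for illustration):

```python
def sd_ranges(mean, sd):
    """Return the predicted 68% (1 SD) and 95% (2 SD) ranges around the mean."""
    return {
        "68%": (mean - sd, mean + sd),
        "95%": (mean - 2 * sd, mean + 2 * sd),
    }


ranges = sd_ranges(50, 10)
print(ranges["68%"])  # (40, 60)
print(ranges["95%"])  # (30, 70)
```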
Now explain this data in words
This says that the sample mean is 72 bpm with an SD of about 9.4 bpm, so 68% of the entire population is predicted to have heart rates between 62.6 and 81.4 bpm (72 +/- 9.4), and 95% of the population between 53.2 and 90.8 bpm (72 +/- 18.8).
Now let's determine the SEM (SE):
Standard Error of the Mean (SEM or SE) = Standard Deviation (S, SD, σ) / SQRT(sample size)
= 9.36/SQRT(10)   (SD before rounding: Sqrt(788/9) = 9.36)
= 2.96 bpm
72 +/- 2.96 bpm
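The same SEM calculation can be sketched in Python. This assumes the standard library's `statistics.stdev` (sample SD, with the "n minus one" divisor) for the SD step; the variable names are illustrative.

```python
import math
import statistics

heart_rates = [76, 82, 90, 73, 58, 65, 63, 74, 71, 68]  # bpm

n = len(heart_rates)
mean = statistics.mean(heart_rates)
sd = statistics.stdev(heart_rates)  # sample SD (n - 1 divisor)
sem = sd / math.sqrt(n)             # SEM = SD / sqrt(sample size)

print(f"{mean:.0f} +/- {sem:.2f} bpm")
```

Notice that the SEM shrinks as the sample size grows, because you divide by the square root of n.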
What does the SEM tell you?
Exactly what it says…it is the predicted error in the mean itself.
It gives you a range within which the actual mean of the entire population is predicted to fall.
Again, remember that you only measured a small sampling…the mean you calculate is not likely the actual mean…what if you measured everyone?
Ex) If the mean is 40 and the SEM is 4, then there is a 68% chance (CI) that the actual mean of the entire population is between 40 +/- 4, or between 36 and 44.
There is a 95% chance (CI) that the mean is within 2 SEMs in either direction, or between 32 and 48.
CI = confidence interval
Now explain this data in words
This says that the sample mean is 72 bpm with an SEM of about 2.96 bpm, so there is a 68% chance that the true mean of the population is between 69.04 and 74.96 bpm (72 +/- 2.96), and a 95% chance that it is between 66.08 and 77.92 bpm (72 +/- 5.92).
Let’s sum up the heart rate data…
MEAN   SD    SEM
72     9.4   2.96
This tells us that the mean of the OBSERVED SAMPLE is 72 bpm. The descriptive stats tell us that if the entire population were to be measured, then 68% is predicted to fall between 62.6 and 81.4 bpm, and 95% between 53.2 and 90.8 bpm. In addition, the true mean of the entire population has a 68% chance of being between 69.04 and 74.96 bpm, and a 95% chance of being between 66.08 and 77.92 bpm.
REVIEW
Standard Deviation (SD, S, σ) describes the range of a particular variable that is predicted to include 68% of the population.
Example: 70 +/- 7 bpm would imply that 68% of the total population would have heart rates between 63 and 77 bpm.
Standard Error of the Mean (SEM, SE) describes the range where the actual mean of the total population is predicted to be with 68% confidence.
Example: 70 +/- 3 bpm would imply that there is a 68% chance that the actual mean of the total population is between 67 and 73 bpm.
Explain in words the data:
The shady leaf width has a mean of ~7.2, and there is a 68% chance that the true population mean is between 7.0 and 7.6 (looks like the SEM is 0.3).
Statistical significance
Are these two groups (shady and sunny) significantly different statistically? Justify.
If the means are different and the error bars do not overlap, then you would predict them to be significantly different.
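The "do the error bars overlap?" check can be written as a small Python sketch. The function name is made up, the shady values are the approximate ones from the example above, and the sunny values are HYPOTHETICAL, chosen only for illustration. This is the rule of thumb described here, not a formal statistical test.

```python
def error_bars_overlap(mean_a, sem_a, mean_b, sem_b):
    """Return True if the mean +/- SEM intervals of two groups overlap."""
    low_a, high_a = mean_a - sem_a, mean_a + sem_a
    low_b, high_b = mean_b - sem_b, mean_b + sem_b
    return low_a <= high_b and low_b <= high_a


shady = (7.2, 0.3)  # approximate mean and SEM from the shady-leaf example
sunny = (5.0, 0.3)  # HYPOTHETICAL mean and SEM, for illustration only

if error_bars_overlap(*shady, *sunny):
    print("Error bars overlap")
else:
    print("Error bars do not overlap: predict a significant difference")
```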
Scatterplots
- Comparing two MEASURED VARIABLES
- If a linear relationship is predicted, a linear regression can be performed (best fit line; Figure 3)
- R² (R-squared or coefficient of determination)
•Typically ranges from 0 to 1
•Describes “goodness of fit” or how well the line drawn fits the points.
•R² = 0 implies no relationship
•R² = 1 implies perfect relationship (all points on line)
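To see where R² comes from, here is a minimal Python sketch of a least-squares best-fit line. The function name `linear_fit` and the data points are HYPOTHETICAL, made up for illustration; R² is computed as 1 minus (leftover variation around the line / total variation in y).

```python
def linear_fit(xs, ys):
    """Least-squares best-fit line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# HYPOTHETICAL measurements, made up for illustration
light_hours = [2, 4, 6, 8, 10]
leaf_width = [7.4, 6.9, 6.1, 5.8, 5.0]

slope, intercept, r2 = linear_fit(light_hours, leaf_width)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.3f}")
```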
Box-and-Whisker Plots (Boxplot)
- Used with nonparametric data (data that is not assumed to follow a normal distribution).
- Vertical lines indicate highest and lowest points in dataset
- Top of box shows upper quartile and bottom shows lower quartile.
- Horizontal line represents the median
[Figure: example boxplot with the upper quartile, median, and lower quartile labeled.]
Determining Lower (Q1) and Upper (Q3) Quartiles:
[Figure: ordered dataset marked with the median (Q2), the lower quartile (Q1), and the upper quartile (Q3).]
You are simply dividing the data into quarters by medians…the upper quartile is the median of the upper half of the data, and the lower quartile is the median of the lower half.
Box-and-Whisker Plots (Boxplot)
Determine the upper and lower quartile of the sycamore and beech leaf data:
Sycamore: 33, 35, 40, 40, 44, 48, 52, 63
Median (Q2) equals 42
Lower Quartile (Q1) equals 37.5
Upper Quartile (Q3) equals 50
Notice how the upper and lower quartile range gives you a sense of the center of the data without the influence of outliers that might exist in nonparametric data!!
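The "median of each half" method can be sketched in Python. The function name `quartiles` is made up for illustration, and it assumes at least two data points. With an odd number of points, this sketch leaves the middle value out of both halves, which is an assumed convention (some textbooks and software include it).

```python
import statistics


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]   # lower half of the ordered data
    upper = values[-half:]  # upper half of the ordered data
    return (
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
    )


sycamore = [33, 35, 40, 40, 44, 48, 52, 63]
q1, q2, q3 = quartiles(sycamore)
print(q1, q2, q3)  # 37.5 42.0 50.0
```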
Beech: 11, 15, 19, 21, 26, 32, 34
Median (Q2) equals 21
Lower Quartile (Q1) equals 15
Upper Quartile (Q3) equals 32
http://www.brainingcamp.com/resources/math/box-plots/questions.php
Histograms
Used to determine if a given set of measurements, like plant height from the artificial selection lab, approximates a normal distribution (parametric) or if the data is nonparametric.
[Figure: Histogram showing parametric data]
[Figure: Histogram showing NONparametric data]