Day 2

Goal 1: To help practicing teachers get a fundamental
understanding of t-tests and p-value in a short amount of time
and apply it to biology data they obtain, without “too much
math.”
2-sample t-tests
p-value approach
Sampling distributions & CLT
Continuous probability distribution
Probability Fundamentals
Goal 2: Learn to use TI-84 Graphing Calculator to do statistical
analysis and find p-values and understand what the results mean.
AIMS – statistics workshop Day 2 page 1
Goal for Day 2:
Develop an intuitive understanding of the binomial and the normal
probability distributions, de-emphasizing math formulas;
Relate normal distributions to teaching and the biology experiment;
Get comfortable with characteristics of the normal distribution,
such as spread (in St. Dev.) and the idea that probability is an area.
Illustrate calculator usage for finding probabilities from probability
distributions.
Develop understanding of what statistically significant means in
terms of probability.
Illustration of Probability Simulator on the TI-84 using
the TI-Smart View software
Question:
Is it better to use actual manipulatives (like spinners and
dice) with students, or software simulations? Does the
age of the student matter?
Recall:
A Probability Distribution is a list of the outcomes from
an experiment along with their respective probabilities.
Experiment: toss 2 coins
Probability Distribution:
Heads    0      1      2
P(H)     1/4    1/2    1/4
The only rules for a probability distribution are that each
probability must be a number between 0 and 1, and the
sum of all the probabilities must equal 1 (or 100%).
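For anyone who wants to check the table away from the calculator, here is a minimal Python sketch (Python is not part of the workshop; the variable names are made up for illustration). It lists the four equally likely outcomes of tossing 2 fair coins, builds the distribution of the number of heads, and checks both rules.

```python
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of tossing 2 fair coins.
outcomes = list(product("HT", repeat=2))

# Count heads in each outcome; each outcome contributes probability 1/4.
distribution = {}
for outcome in outcomes:
    heads = outcome.count("H")
    distribution[heads] = distribution.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

# Check the two rules: every probability is between 0 and 1, and they sum to 1.
assert all(0 <= p <= 1 for p in distribution.values())
assert sum(distribution.values()) == 1

for heads in sorted(distribution):
    print(heads, distribution[heads])
```

Exact fractions are used so the output can be compared directly with the table above.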
Probability distributions may result from discrete or
continuous data. We will look at a discrete probability
distribution - the binomial distribution. Then we will look
at a continuous probability distribution - the normal
distribution.
Probability Distributions
Discrete
* Binomial
Geometric
Poisson
Continuous
* Normal
Chi-Square
What is discrete data? What is continuous data?
Discrete = counts
Continuous = measurements
Discrete or continuous?
1. heights of third-graders in your class
2. number of students in each grade
3. number of siblings for each child
4. weights of students
5. teacher’s salary schedule
spinners, coins, dice = discrete data
All of the examples of probability distributions
from Day 1 were discrete.
When an experiment (with discrete data) has
exactly two outcomes, the probability is constant
for each trial, and there is a fixed number of
trials, the experiment is called binomial.
Tossing coins is a (discrete) binomial experiment.
1st toss    2nd toss
H           H
            T
T           H
            T
Notice that binomial experiments have two branches for each trial.
Experiment: toss 2 coins
Probability Distribution:
Heads    0      1      2
P(H)     1/4    1/2    1/4
Bar graph for probability distribution
[Bar graph: number of heads (0, 1, 2) on the horizontal axis; probability (0 to 1) on the vertical axis]
Notice the sum of the areas of the bars = 1
The notion that the sum of the areas of the bars in a
probability distribution is equal to 100% is a key
concept in statistics!
The experiment of rolling 2 dice is NOT binomial, and you
can easily see that there are more than two branches in the
tree diagram.
Ex: Roll 2 dice
1st die    2nd die
1          1, 2, 3, 4, 5, 6
2
3
4
5
6
Experiment: roll 2 dice
Probability Distribution:
sum       1     2     3     4     5     6     7
P(sum)    0     1/36  2/36  3/36  4/36  5/36  6/36

sum       8     9     10    11    12    13
P(sum)    5/36  4/36  3/36  2/36  1/36  0
However, it is a probability distribution, and here’s the
bar graph for rolling two dice. The sum of the areas of
the bars = 100%.
[Bar graph: sum of the two dice (2 to 12) on the horizontal axis; probability (1/36 to 6/36) on the vertical axis]
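The two-dice distribution can be rebuilt by counting, which is all the tree diagram does. The Python sketch below (an illustration only; the names are assumptions, not workshop material) lists all 36 equally likely pairs, counts how many give each sum, and confirms that the probabilities add to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (1st die, 2nd die) pairs.
rolls = list(product(range(1, 7), repeat=2))

# How many pairs produce each sum.
counts = Counter(a + b for a, b in rolls)

# Probability of each sum from 1 to 13; sums of 1 and 13 never occur, so they get 0.
dice_distribution = {s: Fraction(counts[s], len(rolls)) for s in range(1, 14)}

# Print counts over 36 so the output matches the unreduced fractions in the table.
for s in range(1, 14):
    print(f"{s:2d}  {counts[s]}/36")

assert sum(dice_distribution.values()) == 1
```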
BINOMIAL EXPERIMENT
Requirements:
1. outcomes classified into two categories
2. has a fixed number of trials, n
3. each trial is independent
4. probability of success stays constant for each trial, p
Notation:
n = number of trials
x = desired number of successes
p = probability of success on single trial
Which of the following are binomial experiments?
1. Tossing an unbiased coin 500 times.
2. Tossing a biased coin 500 times.
3. Surveying 500 consumers to find the brands of toothpaste
preferred.
4. Surveying 500 consumers to determine whether their preferred
brand of toothpaste is Brand X.
5. Polling 500 voters on the Presidential election from a population of
200,000 voters if 35% are Republican and 65% are Democratic.
6. Polling 1000 voters in the presidential election from a population of
8 million voters, of which 40% are Democrats, 35% are
Republicans, and 25% are Independent.
7. Firing 20 missiles at a target with a hit rate of 90%.
8. Testing a sample of eight drug dosages from a population of 5000,
of which 2% are contaminated.
9. Administering a driving test to 50 license applicants with a passing
rate of 72%.
QUIZ
Number your paper 1-5.
Multiple choice: answer A, B, C, D
We are going to look at the probability distribution for
the number of right answers.
# right          0    1    2    3    4    5
Prob(# right)
How do we find the probabilities for the table?
We could use tree diagrams . . . very messy . . .
[Tree diagram: items 1 through 5, each branching into R (right) and W (wrong), for 2 x 2 x 2 x 2 x 2 = 32 paths in all]
or we could use rules of probability to find
probabilities . . . very tedious . . .
Ex: Prob(exactly 1 right) = P(RWWWW) or P(WRWWW)
or P(WWRWW) or P(WWWRW) or P(WWWWR) =
¼•¾•¾•¾•¾ + ¼•¾•¾•¾•¾ + ¼•¾•¾•¾•¾ + ¼•¾•¾•¾•¾ + ¼•¾•¾•¾•¾ =
5 • 81/1024 = 405/1024, or about 0.396
And we’d have to do this for each entry in the table!!
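To see what "doing this for each entry" involves, here is a Python sketch that walks every branch of the tree and adds up the probabilities. It assumes pure guessing on each item with one right answer among A, B, C, D (so ¼ right, ¾ wrong, as in the example); the names are illustrative only.

```python
from fractions import Fraction
from itertools import product

P_RIGHT = Fraction(1, 4)   # assumption: pure guessing among A, B, C, D
P_WRONG = 1 - P_RIGHT

# Walk every branch of the tree: 32 patterns such as RWWWW.
quiz_distribution = {k: Fraction(0) for k in range(6)}
for pattern in product("RW", repeat=5):
    prob = Fraction(1)
    for answer in pattern:
        # Multiply along the branch (independent items).
        prob *= P_RIGHT if answer == "R" else P_WRONG
    # Add this branch to the entry for its number of right answers.
    quiz_distribution[pattern.count("R")] += prob

for k, p in quiz_distribution.items():
    print(k, p, round(float(p), 4))
```

The loop does by brute force what the multiplication and addition rules do by hand.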
Note: the binomial formula is a generalization of the
rules of probability:
$$P(x) = \frac{n!}{x!\,(n-x)!} \cdot p^{x} \cdot (1-p)^{n-x}$$
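As a rough check on the formula, the Python sketch below codes it directly. The function name `binomial_probability` is made up for this illustration; its argument order (n, p, x) simply follows the notation above. It is applied to the quiz with n = 5 and p = 1/4.

```python
from math import comb

def binomial_probability(n: int, p: float, x: int) -> float:
    """P(x) = n! / (x! (n - x)!) * p**x * (1 - p)**(n - x)."""
    # comb(n, x) is the n! / (x! (n - x)!) part of the formula.
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Quiz example: n = 5 items, p = 1/4 chance of guessing an item right.
quiz_table = [binomial_probability(5, 0.25, x) for x in range(6)]
for x, prob in enumerate(quiz_table):
    print(x, round(prob, 4))
```

The six printed values fill in the Prob(# right) row, and they add to 1.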
Another, easier alternative is to let the TI-84 calculator
“do the math.”
DISTR –> binompdf(n, p, x)
# right          0    1    2    3    4    5
Prob(# right)
And the graph of the distribution might look like this:
[Bar graph: # right (0 to 5) on the horizontal axis; probability on the vertical axis]
But the point here is NOT to get bogged down in
calculator distributions (although you may want to
explore them further on your own), and not to get bogged
down in the math, but rather to understand this:
The graphs of probability distributions show 100% of
the probability, which is equivalent to the sum of
the areas of the bars. The area of each bar
represents the probability for the particular value of
the random variable.
With this main idea in mind, let’s move on to a continuous
probability distribution, the normal distribution.
NORMAL DISTRIBUTION
Characteristics:
1. Continuous data (or data treated as continuous)
2. Symmetric
3. Virtual range (99.7% of data) within 3 standard
deviations each side of mean (a spread of 6 st. dev.)
4. Total area under the curve = 100%
5. Probability = area under curve between 2 data values
Notation:
x̄ = mean
s = standard deviation
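The characteristics above can be checked numerically. The Python sketch below uses the standard library's `NormalDist` (not a workshop tool; `area_within` is an illustrative name) to find the area under the normal curve within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k: float) -> float:
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

Because every normal distribution has the same shape once measured in standard deviations, these areas hold for any mean and standard deviation.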
With continuous data, we can’t make a table as we did for
discrete data. Instead, we draw the graph of the distribution,
but instead of a bar graph, we use a frequency curve.
We are going to do two problems, just to give you a sense of
how probability is related to the normal distribution.
Heights of women are normally distributed with a mean of
63.6 in. and a standard deviation of 2.5 in. (based on
information from the National Health Survey). The U.S.
Army requires women's heights to be between 58 in. and 80
in. Find the percentage of women meeting that height
requirement. Are many women being denied the opportunity
to join the Army because they are too short or too tall?
To solve the problem, we need to know the area (which is the
probability) between 58 and 80 under the curve.
In calculus-based statistics, you integrate to find the area
between two points under a curve. CALCULUS!! Argh!!
We are going to let the TI-84 calculator “do the math.”
DISTR –> normalcdf(lower, upper, x̄, s)
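For comparison with the calculator, here is a Python sketch of the same computation for the women's heights problem. The helper below is named `normalcdf` only to mirror the calculator command; it is an illustrative stand-in built on the standard library's `NormalDist`, not TI software.

```python
from statistics import NormalDist

def normalcdf(lower: float, upper: float, mean: float, sd: float) -> float:
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Women's heights: mean 63.6 in., st. dev. 2.5 in.; Army limits 58 in. to 80 in.
share_meeting_requirement = normalcdf(58, 80, 63.6, 2.5)
print(f"{share_meeting_requirement:.2%}")
```

The result is the share of women whose height falls between the two limits; the rest are too short or too tall.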
Cans of regular Coke are labeled as containing 12 oz. The
contents are normally distributed with a mean of 12.19 oz.
and a standard deviation of 0.11 oz. (based on information
from the Coca-Cola Co.). What percentage of cans contain
less than the 12 oz. printed on the label?
For this problem, the question of interest is: Are many
consumers being cheated?
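Here the area of interest is everything to the left of 12 oz., so there is no lower limit. A Python sketch under the same assumptions as the problem (mean 12.19 oz., st. dev. 0.11 oz.; the variable names are illustrative):

```python
from statistics import NormalDist

# Fill volumes: mean 12.19 oz., st. dev. 0.11 oz. (figures from the problem).
coke = NormalDist(mu=12.19, sigma=0.11)

# Area to the left of 12 oz. = share of cans holding less than the label says.
share_underfilled = coke.cdf(12)
print(f"{share_underfilled:.2%}")
```

On the calculator, a very large negative number such as -1E99 is commonly entered as the lower limit to get the same left-tail area.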
The Normal Probability Distribution helps us decide what
results are typical, and what results are unusual. For example,
if you knew the heights of plants are normally
distributed (mean 13, st. dev. 2), then how unusual would
it be to get a plant that grew 22 inches tall?
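One way to put a number on "how unusual" is sketched below in Python (an illustration, not workshop material). It finds how many standard deviations 22 is above the mean, then the probability of a plant at least that tall, taking "mean 13, st. dev. 2" to be in inches.

```python
from statistics import NormalDist

# Plant heights: mean 13, st. dev. 2, as in the example (units assumed to be inches).
plants = NormalDist(mu=13, sigma=2)

# How many standard deviations above the mean is a 22-inch plant?
z = (22 - plants.mean) / plants.stdev

# Probability of a plant at least that tall (area to the right of 22).
p_at_least_22 = 1 - plants.cdf(22)
print(z, p_at_least_22)
```

A height 4.5 standard deviations above the mean lies well outside the 3-standard-deviation virtual range, so the right-tail area is tiny.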
Now, a little twist on our thinking . . . what if you are
told the probability of winning Lotto is 1/13000000, and
yet you buy a ticket and . . . YOU WIN?
Is it significant?
In Statistics, we use the word significant to describe
getting a result or outcome that we didn’t expect to get
(because the probability was so small).
So a plant that grows to 22 inches, when the heights of
plants are normally distributed with a mean 13 and a st.
dev. 2, is statistically significant because the probability
of getting such a plant is low.
Connected Math – page 8
Quality ratings for natural and regular peanut butter
If you make a decision based on looking at graphs, you can’t be
sure if the differences you see are typical sampling
fluctuation, or a significant difference in quality ratings.
Statistics helps you find the probability, which helps you make
an informed decision!
Suppose two teachers compare their students' ISAT scores,
and see that one of the classes has lower scores than the
other class. Should the teacher whose class has lower scores
be concerned?
What if the two teachers were told that these results were
just random sampling fluctuation, and that there was no
statistically significant difference between the classes?
Or maybe a teacher notices that many of her students are
heavier than students in the other classes of the same grade.
Should she be worried about her class being overweight?
What if she were told that the probability of having a class
with this average weight was very low, i.e., that the result was
statistically significant? Now should she worry?
Explain what it means to you now when you hear the words
“statistically significant.”
OK, so the idea of knowing probabilities can help us make good
decisions.
And if an outcome is statistically significant, that means you
got a result which had a low probability of occurring.
And we have some understanding of the normal distribution
and finding probabilities using it.
But now you might be thinking … not all data is normal. The
plant data you will be collecting is not necessarily normal. So
what’s the fuss about knowing the normal distribution????
- stay tuned for tomorrow’s lesson!