4. Probability as Relative Frequency

Statistics 312 – Uebersax Course page: www.john-uebersax.com/stat312
09 Probability Theory (more) & Graphs (more)
Old Business
- Copies of textbook on reserve
- Correction: BINOMDIST
- Fermat's theorem
- Why Use (n-1) for Sample Variance?
New Business
- Probability as relative frequency
- Conditional probability example
- Pascal's triangle (combinations)
- Box-and-whisker plots
- Five-number summaries
1. Excel Function BINOMDIST
The format is =BINOMDIST(k, n, p, cumul). In the last lecture I mistakenly reversed the
order of k and n.
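As a cross-check on the argument order, here is a minimal Python sketch of what the function computes, assuming k is the number of successes, n the number of trials, p the probability of success on each trial, and cumul a TRUE/FALSE flag for the cumulative probability. The name binomdist and the example values are illustrative, not part of Excel.

```python
from math import comb


def binomdist(k, n, p, cumul):
    """Binomial probability, with arguments in the order (k, n, p, cumul).

    cumul=False gives Pr(exactly k successes in n trials);
    cumul=True gives Pr(k or fewer successes in n trials).
    """
    def pmf(j):
        # Probability of exactly j successes in n independent trials.
        return comb(n, j) * p**j * (1 - p) ** (n - j)

    if cumul:
        return sum(pmf(j) for j in range(k + 1))
    return pmf(k)


# Illustrative values: 3 successes in 10 trials with p = 0.5.
print(binomdist(3, 10, 0.5, False))  # 0.1171875
print(binomdist(3, 10, 0.5, True))   # 0.171875
```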
2. Fermat's Last Theorem (optional)
See separate handout.
3. Why Use (n – 1) for Sample Variance
See proof on separate handout.
4. Probability as Relative Frequency
Ball-and-Urn Problem
You have an 'urn' (a large jar) filled with red and black marbles or balls:
Suppose there are 70 black and 30 red balls in the urn.
Draw 1 ball at random.
What is the probability (p) that the ball will be red?
One way to define probability is as the relative frequency of a target event in the
population/sample.
Relative frequency = no. of target events / no. of all possible events.
Answer: 30 target events (red balls) / 100 possible events (all balls) = 30/100 = .30 = p.
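The relative-frequency definition can be checked with a short Python sketch. It computes the exact relative frequency of red balls among all 100 balls in the urn, then simulates repeated draws with replacement. The seed and the number of draws are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

# The urn from the example: 70 black balls and 30 red balls.
urn = ["black"] * 70 + ["red"] * 30

# Relative frequency = no. of target events / no. of all possible events.
p_red = Fraction(urn.count("red"), len(urn))
print(p_red, float(p_red))  # 3/10 0.3

# Simulated draws with replacement; the observed proportion of red balls
# should settle near p_red as the number of draws grows.
rng = random.Random(312)  # arbitrary seed, so the run is repeatable
draws = [rng.choice(urn) for _ in range(10_000)]
observed = draws.count("red") / len(draws)
print(observed)
```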
Application to Conditional Probability
Remember that the conditional probability of B given A is
Pr(B|A) = Pr(A and B) / Pr(A).
The concept of relative frequency helps us understand how this formula works.
Example: A village in Siberia has snow at the following rates during 'winter':
          No. snow days    No. of days
Dec.            7               31
Jan.           15               31
Feb.            8               28
Total          30               90
Probability of snow on a randomly selected day in this period:
= Pr(snow)
= no. of snow days / no. of days = 30/90 = 1/3 = .333.
Conditional probability of snow given that it's January:
= Pr(snow|Jan.)
= no. of snow days in Jan. / no. of days in Jan. = 15/31 = .484.
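The link between relative frequency and the formula Pr(B|A) = Pr(A and B) / Pr(A) can be verified numerically. The sketch below, using the snow counts from the example, takes A = "the day is in January" and B = "it snows". The number of days (90) cancels in the ratio, leaving 15/31. Variable names are illustrative.

```python
from fractions import Fraction

# Counts from the Siberian village example.
snow_days = {"Dec.": 7, "Jan.": 15, "Feb.": 8}
total_days = {"Dec.": 31, "Jan.": 31, "Feb.": 28}

n_days = sum(total_days.values())  # 90 days in the period

# Each probability is a relative frequency over the same 90 days.
pr_snow = Fraction(sum(snow_days.values()), n_days)    # 30/90
pr_jan = Fraction(total_days["Jan."], n_days)          # 31/90
pr_snow_and_jan = Fraction(snow_days["Jan."], n_days)  # 15/90

# Pr(snow|Jan.) = Pr(snow and Jan.) / Pr(Jan.)
pr_snow_given_jan = pr_snow_and_jan / pr_jan

print(pr_snow)                             # 1/3
print(pr_snow_given_jan)                   # 15/31
print(round(float(pr_snow_given_jan), 3))  # 0.484
```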
5. Pascal's Triangle
Pascal's triangle, named for Blaise Pascal (1623–1662), is a simple method to compute the
binomial coefficient, i.e., an alternative to the formula:
C(n, k) = n! / (k! (n – k)!)
How it works: each coefficient in a lower row is produced by adding two adjacent
coefficients in the row above:
1. In each row, the first and last values are always 1.
2. Therefore, the first two rows contain only 1's.
3. Starting with the third row, each interior value is produced by adding the two numbers
in the row above, to its left and right.
4. Example: in row 3, the "2" is produced by adding 1 (above it, left) and 1 (above it, right).
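The construction rules above can be sketched in Python as follows. Rows are numbered from 1, as in the notes, so row n + 1 holds the coefficients C(n, 0) through C(n, n). The function name pascal_rows is illustrative, and math.comb is used only to cross-check against the factorial formula.

```python
from math import comb


def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle as lists."""
    rows = []
    for _ in range(n_rows):
        if not rows:
            row = [1]  # row 1 is a single 1
        else:
            prev = rows[-1]
            # First and last values are 1; each interior value is the sum
            # of the two adjacent values in the row above.
            interior = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
            row = [1] + interior + [1]
        rows.append(row)
    return rows


for row in pascal_rows(5):
    print(row)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]

# Cross-check: entry k of row n + 1 equals n! / (k! (n - k)!).
for n, row in enumerate(pascal_rows(8)):
    assert all(row[k] == comb(n, k) for k in range(n + 1))
```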
6. Box-and-Whisker Plots
A box-and-whisker plot is an easy way to characterize the properties of a data distribution.
It can also be used to detect skewness and other distributional shapes.
Reading: pp. 123-125, pp. 117-119 (Review 'Five-Number Summary')
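A box-and-whisker plot is drawn from the five-number summary (minimum, first quartile, median, third quartile, maximum). The Python sketch below computes one for a small made-up data set. Note that quartile conventions differ between textbooks and software; this sketch assumes the "inclusive" method of the standard library's statistics.quantiles, which may not match the textbook's or JMP's rule exactly.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers."""
    values = sorted(data)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return values[0], q1, median, q3, values[-1]


# Hypothetical data, for illustration only.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
minimum, q1, median, q3, maximum = five_number_summary(sample)
print(minimum, q1, median, q3, maximum)

# A median much closer to Q1 than to Q3 (or a much longer upper whisker)
# suggests right skew; the reverse suggests left skew.
print("Q1 to median:", median - q1, " median to Q3:", q3 - median)
```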
Box-and-Whisker Plots with JMP
Instructions here:
http://web.utk.edu/~cwiek/201Tutorials/SideBySideBoxPlots/