|PART TWO Essentials| -- Basic Probability Concepts

Probability--the likelihood of an event
    Probability is expressed as a decimal or fraction between zero and one, inclusive.
    An event that is certain has a probability of 1. An event that is impossible has a probability of 0.
    If the probability of rain today (R) is 30%, it can be written P(R) = 0.3.
Objective probabilities--calculated from data according to generally accepted methods
    Relative frequency method--example: In a class of 25 college students there are 14 seniors. If a student is selected at random from the class, the probability of selecting a senior is 14/25 or 0.56. Relative to the number in the class, 25, the number of seniors (frequency), 14, is 56% or 0.56.
Subjective probabilities--arrived at through judgment, experience, estimation, educated guessing, intuition, etc.
    There may be as many different answers as there are people making the estimate. (With objective probability, all should get the same answer.)

|Essentials| -- Boolean Operations
Boolean algebra--(George Boole, 1815-1864)
    Used to express various logical relationships; taught as "symbolic logic" in college philosophy and mathematics departments; important in computer design
Complementation--translated by the word "not"--symbol: Ā ("A-bar")
    Complementary events are commonly known as "opposites."
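The relative frequency example and the complement rule can be sketched in a few lines of Python. The variable names are illustrative; `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Relative frequency method, using the class example above:
# 14 seniors in a class of 25.
p_senior = Fraction(14, 25)        # P(Senior) = 14/25
p_not_senior = 1 - p_senior        # complement: P(not Senior) = 11/25

print(float(p_senior))             # 0.56
print(p_senior + p_not_senior)     # 1 -- complementary events sum to 1
```

The same complement arithmetic applies to any partition: the probabilities of all the events in the partition must sum to 1.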
    Examples: Heads/Tails on a coin flip; Rain/No Rain on a particular day; On Time/Late for work
    Complementary events have two properties
        Mutually exclusive--they cannot occur together; each excludes the other
        Collectively exhaustive--there are no other outcomes; the two events are a complete, or exhaustive, list of the possibilities
Partition--a set of more than two events that are mutually exclusive and collectively exhaustive
    Examples: A, B, C, D, F, W, I--grades received at the end of a course; Freshman, Sophomore, Junior, Senior--traditional college student categories
    The sum of the probabilities of complementary events, or of the probabilities of all the events in a partition, is 1.
Intersection--translated by the words "and," "with," or "but"--symbol: ∩ or, for convenience, n
    A day that is cool (C) and rainy (R) can be designated (CnR).
    If there is a 25% chance that today will be cool (C) and rainy (R), it can be written P(CnR) = 0.25.
    Intersections are often expressed without using the word "and." Examples: "Today might be cool with rain." or "It may be a cool, rainy day."
    Two formulas for intersections:
        For any two events A and B: P(AnB) = P(A|B)*P(B)  ("|" is defined below.)
        For independent events A and B: P(AnB) = P(A)*P(B)  (This will appear later as a test for independence.)
            This formula may be extended to any number of independent events: P(AnBnCn...nX) = P(A)*P(B)*P(C)*...*P(X), where X is any letter
    The intersection operation has the commutative property: P(AnB) = P(BnA)
        "Commutative" is related to the word "commute," which means "to switch." The events can be switched without changing anything. In our familiar algebra, addition and multiplication are commutative, but subtraction and division are not.
    Intersections are also called "joint probabilities" (joint in the sense of "together").
Union--translated by the word "or"--symbol: ∪ or, for convenience, u
    A day that is cool (C) or rainy (R) can be designated (CuR).
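The multiplicative rule for independent events, its extended form, and the commutative property can be checked numerically. The probabilities below are invented for illustration, not taken from the text.

```python
from math import prod

# Multiplicative rule for independent events: P(CnR) = P(C)*P(R).
# Illustrative values: suppose cool and rainy are independent here.
p_cool, p_rainy = 0.5, 0.5
p_cool_and_rainy = p_cool * p_rainy      # 0.25

# Extended rule: P(AnBnC...nX) = P(A)*P(B)*...*P(X),
# e.g. three independent components that must all work.
components_up = [0.9, 0.8, 0.95]
p_all_up = prod(components_up)           # about 0.684

# Commutative property: the events can be switched.
assert p_cool * p_rainy == p_rainy * p_cool
print(p_cool_and_rainy)                  # 0.25
```

Note that the extended rule is valid only when the events are independent; for dependent events the general formula P(AnB) = P(A|B)*P(B) must be used instead.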
    If there is a 25% chance that today will be cool (C) or rainy (R), it can be written P(CuR) = 0.25.
    Unions always use the word "or."
    Addition rule to compute unions: P(AuB) = P(A) + P(B) - P(AnB)
        The subtraction of P(AnB) eliminates the double counting that occurs when P(A) is added to P(B).
    The union operation is commutative: P(AuB) = P(BuA)
Condition--translated by the word "given"--symbol: |
    A day that is cool (C) given that it is rainy (R) can be designated (C|R). The event R is called the condition.
    If there is a 25% chance that today will be cool (C) given that it is rainy (R), it can be written P(C|R) = 0.25.
    Conditions are often expressed without using the word "given." Examples:
        "The probability that it will be cool when it is rainy is 0.25."  [P(C|R) = 0.25]
        "The probability that it will be cool if it is rainy is 0.25."  [P(C|R) = 0.25]
        "25% of the rainy days are cool."  [P(C|R) = 0.25]
        All three of the above statements are the same, but this next one is different: "25% of the cool days are rainy." This one is P(R|C) = 0.25.
    The condition operation is not commutative: P(A|B) ≠ P(B|A)
        For example, it is easy to see that P(rain|clouds) is not the same as P(clouds|rain).
    Conditional probability formula: P(A|B) = P(AnB) / P(B)

|Essentials| -- Occurrence Tables and Probability Tables
Occurrence table--table that shows the number of items in each category and in the intersections of categories
    Can be used to help compute probabilities of single events, intersections, unions, and conditional probabilities
Probability table--created by dividing every entry in an occurrence table by the total number of occurrences
    Probability tables contain marginal probabilities and joint probabilities.
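The addition rule and the conditional probability formula above can be checked with a quick numeric sketch. The values are invented for illustration: suppose P(C) = 0.40, P(R) = 0.35, and P(CnR) = 0.25.

```python
# Illustrative probabilities (not from the text):
p_c, p_r, p_c_and_r = 0.40, 0.35, 0.25

# Addition rule: P(CuR) = P(C) + P(R) - P(CnR)
# Subtracting P(CnR) removes the double-counted overlap.
p_c_or_r = p_c + p_r - p_c_and_r
print(round(p_c_or_r, 2))           # 0.5

# Conditional probability: P(C|R) = P(CnR) / P(R)
p_c_given_r = p_c_and_r / p_r       # about 0.714
# The condition is not commutative: P(R|C) = P(CnR) / P(C)
p_r_given_c = p_c_and_r / p_c       # 0.625
print(round(p_c_given_r, 3), round(p_r_given_c, 3))
```

The two conditional results differ even though both are built from the same joint probability, which is exactly the non-commutativity noted above.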
    Marginal probabilities--probabilities of single events, found in the right and bottom margins of the table
    Joint probabilities--probabilities of intersections, found in the interior part of the table where the rows and columns intersect
    Unions and conditional probabilities are not found directly in a probability table, but they can be computed easily from values in the table.
    Two conditional probabilities are complementary if they have the same condition and the events before the "bar" (|) are complementary.
        For example, if warm (W) is the opposite of cool (C), then (W|R) is the complement of (C|R), and P(W|R) + P(C|R) = 1.
        In a 2 x 2 probability table, there are eight conditional probabilities, forming four pairs of complementary conditional probabilities.
        It is also possible for a set of conditional probabilities to constitute a partition (if they all have the same condition, and the events before the "bar" are a partition).

|Essentials| -- Testing for Dependence/Independence
Statistical dependence
    Events are statistically dependent if the occurrence of one event affects the probability of the other event.
    Identifying dependencies is one of the most important tasks of statistical analysis.
Tests for independence/dependence
    Conditional probability test--posterior/prior test
        Prior and posterior are, literally, the Latin words for "before" and "after."
        A prior probability is one that is computed or estimated before additional information is obtained. Prior probabilities are probabilities of single events, such as P(A).
        A posterior probability is one that is computed or estimated after additional information is obtained. Posterior probabilities are conditional probabilities, such as P(A|B).
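The occurrence-table-to-probability-table procedure can be sketched as follows. The counts are invented: 100 days classified as Cool/Warm (rows) and Rain/No Rain (columns).

```python
# Occurrence table: counts for each intersection of categories.
occurrences = {
    ("Cool", "Rain"): 25, ("Cool", "NoRain"): 15,
    ("Warm", "Rain"): 10, ("Warm", "NoRain"): 50,
}
total = sum(occurrences.values())                    # 100

# Probability table: divide every entry by the total.
prob = {cell: n / total for cell, n in occurrences.items()}

# Joint probability (interior of the table): P(Cool n Rain)
p_cool_and_rain = prob[("Cool", "Rain")]             # 0.25

# Marginal probabilities (margins of the table): row and column sums.
p_cool = sum(p for (t, w), p in prob.items() if t == "Cool")   # 0.40
p_rain = sum(p for (t, w), p in prob.items() if w == "Rain")   # 0.35

# Conditional probabilities are not in the table but come from it:
p_cool_given_rain = p_cool_and_rain / p_rain
p_warm_given_rain = prob[("Warm", "Rain")] / p_rain
# Complementary conditionals share the condition and sum to 1.
assert abs(p_cool_given_rain + p_warm_given_rain - 1) < 1e-9
```

A dictionary keyed by (row, column) stands in for the 2 x 2 table; in practice any tabular structure works, since the operations are just division and row/column sums.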
        Independence exists between any two events A and B if P(A|B) = P(A)
            If P(A|B) = P(A), the occurrence of B has no effect on P(A).
            If P(A|B) ≠ P(A), the occurrence of B does have an effect on P(A).
        Positive dependence if P(A|B) > P(A) -- posterior greater than prior
        Negative dependence if P(A|B) < P(A) -- posterior less than prior
    Multiplicative test--joint/marginal test
        Independence exists between any two events A and B if P(AnB) = P(A)*P(B)
        Positive dependence if P(AnB) > P(A)*P(B) -- intersection greater than product
        Negative dependence if P(AnB) < P(A)*P(B) -- intersection less than product

|Essentials| -- Bayesian Inference
Thomas Bayes (1702-1761)
    Bayes developed a technique to compute a conditional probability, given the reverse conditional probability.
    Computations are simplified, and complex formulas can often be avoided, if a probability table is used.
    Basic computation is: P(A|B) = P(AnB) / P(B) -- an intersection probability divided by a single-event probability; that is, a joint probability divided by a marginal probability.
    Bayesian analysis is very important because most of the probabilities upon which we base decisions are conditional probabilities.

|Essentials| -- Other Probability Topics
Matching-birthday problem
    Example of a "sequential" intersection probability computation, where each probability is revised slightly and complementary thinking is used
    Complementary thinking--strategy of computing the complement of what is really sought (because it is easier), then subtracting from 1
Redundancy
    Strategy of using back-ups to increase the probability of success
    Usually employs complementary thinking and the extended multiplicative rule for independent events to compute the probability of failure. P(Success) is then equal to 1 - P(Failure).
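The topics above can be sketched numerically. All values are invented for illustration: P(C) = 0.40, P(R) = 0.35, P(CnR) = 0.25, a 23-person birthday group, and three back-up pumps each failing with probability 0.10.

```python
from math import prod

# 1) Two tests for independence.
p_c, p_r, p_c_and_r = 0.40, 0.35, 0.25

p_c_given_r = p_c_and_r / p_r       # posterior P(C|R), about 0.714
# Conditional test: compare posterior P(C|R) with prior P(C).
print(p_c_given_r > p_c)            # True -> positive dependence
# Multiplicative test: compare joint P(CnR) with product P(C)*P(R).
print(p_c_and_r > p_c * p_r)        # True -> same conclusion

# 2) Bayes: the reverse conditional from the same joint probability,
#    a joint divided by a marginal.
p_r_given_c = p_c_and_r / p_c       # P(R|C) = 0.625

# 3) Matching-birthday problem: a sequential intersection where each
#    factor is revised slightly, plus complementary thinking.
def p_shared_birthday(n):
    p_no_match = prod((365 - k) / 365 for k in range(n))
    return 1 - p_no_match           # P(at least one match)

print(round(p_shared_birthday(23), 3))   # 0.507 -- better than even

# 4) Redundancy: success means not all independent back-ups fail.
p_fail_each = 0.10
p_success = 1 - p_fail_each ** 3    # 1 - P(Failure)
print(round(p_success, 3))          # 0.999
```

Both independence tests always agree, since P(A|B) > P(A) and P(AnB) > P(A)*P(B) are the same inequality rewritten; the birthday and redundancy computations both lean on complementary thinking because the complement ("no match," "all fail") is the easier event to multiply out.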
Terminology--Probability--explain each of the following:
    probability, experiment, event, simple event, compound event, sample space, relative frequency method, classical approach, law of large numbers, random sample, impossible event probability, certain event probability, complement, partition, subjective probability, occurrence table, probability table, addition rule for unions, mutually exclusive, collectively exhaustive, redundancy, multiplicative rule for intersections, tree diagram, statistical independence/dependence, conditional probability, Bayes' theorem, acceptance sampling, simulation, risk assessment, Boolean algebra, complementation, intersection, union, condition, marginal probabilities, joint probabilities, prior probabilities, posterior probabilities, two tests for independence, triad, complementary thinking, commutative
Skills and Procedures--given appropriate data,
    prepare an occurrence table
    prepare a probability table
    compute the following 20 probabilities:
        4 marginal probabilities (single simple events)
        4 joint probabilities (intersections)
        4 unions
        8 conditional probabilities--identify the 4 pairs of conditional complementary events
    identify triads (one unconditional and two conditional probabilities in each triad)
    conduct the conditional (prior/posterior) probability test for independence/dependence
    conduct the multiplicative (joint/marginal) test for independence/dependence
    identify positive/negative dependency
    identify Bayesian questions
    use the extended multiplicative rule to compute probabilities
    use complementary thinking to compute probabilities
    compute the probability of "success" when redundancy is used
Concepts--
    give an example of two or more events that are not mutually exclusive
    give an example of two or more events that are not collectively exhaustive
    give an example of a partition--a set of three or more events that are mutually exclusive and collectively exhaustive
    express the following in symbolic form,
        using F for females and V for voters in a retirement community:
            60% of the residents are females
            30% of the residents are female voters
            50% of the females are voters
            75% of the voters are female
            70% of the residents are female or voters
            30% of the residents are male non-voters
            25% of the voters are male
            40% of the residents are male
    identify which two of the items above are a pair of complementary probabilities
    identify which two of the items above are a pair of complementary conditional probabilities
    from the items above, comment on the dependency relationship between F and V
    if there are 100 residents, determine how many female voters there would be if gender and voting were independent
    explain why joint probabilities are called "intersections"
    identify which two of our familiar arithmetic operations and which two Boolean operations are commutative
    tell what Thomas Bayes is known for (not English muffins)