Math 1 Interpreting Categorical & Quantitative Data II. Name:____________________

Summarize, represent, and interpret data on two categorical and quantitative variables.

S-ID.5 Summarize categorical data for two categories in two-way frequency tables. Interpret frequencies in the contexts of the data (including joint, marginal and conditional relative frequencies). Recognize possible associations and trends in the data.

Whatever way we spin it, statistics is about numbers. So it makes sense that statisticians use a lot of numerical data (height, weight, age, etc.), but even that gets too easy after a while. Data that isn't represented numerically is known as categorical data (eye color, hair color, sex, etc.). You should know what to do with categorical data and how to analyze it.

You should be able to analyze data from two different categories. For instance, data collected from both men and women about their favorite DC comic book superheroes (Wonder Woman, Batman, or Superman) can be summarized in one table.

This table is a two-way frequency table because we can break the data down in two ways: by sex (100 of the people surveyed are male and 100 are female) or by favorite superhero (87 prefer Wonder Woman, 63 prefer Batman and 50 prefer Superman).

You should also be able to convert this data into a two-way relative frequency table, in which each count is divided by the total number of people surveyed (200 here).

The numbers in the middle of the table are called joint probabilities because they depend on more than one category or event occurring at the same time. In this case, we want to know whether a person is male or female and which superhero they prefer. Written in math language, each entry in the table represents P(Sex & Superhero). The marginal probabilities represent the probability of only one category, P(Sex) or P(Superhero). They're called marginal because they're on the margins of the table.
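To make the joint and marginal probabilities concrete, here is a minimal Python sketch that turns a two-way frequency table into relative frequencies. The six cell counts are an assumption: they are worked backward from the totals and relative frequencies quoted in this handout (for example, 0.310 x 200 = 62 women who prefer Wonder Woman), and the variable names are made up for illustration.

```python
# Two-way frequency table for the superhero survey.
# ASSUMPTION: cell counts are recovered from the totals (100 male, 100 female,
# 87 / 63 / 50 per superhero) and the relative frequencies quoted in the text.
counts = {
    "Female": {"Wonder Woman": 62, "Batman": 23, "Superman": 15},
    "Male":   {"Wonder Woman": 25, "Batman": 40, "Superman": 35},
}
heroes = ["Wonder Woman", "Batman", "Superman"]

# Grand total of people surveyed.
total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: each cell divided by the grand total, P(Sex & Superhero).
joint = {
    (sex, hero): counts[sex][hero] / total
    for sex in counts
    for hero in heroes
}

# Marginal probabilities: row totals and column totals divided by the grand total.
p_sex = {sex: sum(counts[sex].values()) / total for sex in counts}
p_hero = {hero: sum(counts[sex][hero] for sex in counts) / total for hero in heroes}

for (sex, hero), p in joint.items():
    print(f"P({sex} & {hero}) = {p:.3f}")
for sex, p in p_sex.items():
    print(f"P({sex}) = {p:.3f}")
for hero, p in p_hero.items():
    print(f"P({hero}) = {p:.3f}")
```

The six joint probabilities add up to 1, and so do each set of marginal probabilities, which is a quick way to check a relative frequency table built by hand.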
If we know the data for one category and not the other (say, we know the person is male, but not which superhero they prefer), we can calculate the probability that his favorite superhero is Superman. This is called a conditional probability because it is conditional on knowing part of the data. We write this in math language as P(SM|Male). (The | symbol means "given.") We can calculate P(SM|Male) from the relative frequency table because we know 0.50 of the people surveyed were men, and 0.175 of the people surveyed were both male and preferred Superman. We can use both of these values to determine:

P(SM|Male) = P(Male & SM) / P(Male) = 0.175 / 0.50 = 0.35

You should feel comfortable creating, understanding, and using these tables to calculate probabilities for more than two categories. You should also be able to determine the probability of combinations (for instance, P(F) = P(F & WW) + P(F & BM) + P(F & SM) = 0.310 + 0.115 + 0.075 = 0.5) and negations (such as P(F & WW'), meaning the probability that a surveyed person is female and does not prefer Wonder Woman, as P(F & WW') = P(F & BM) + P(F & SM) = 0.115 + 0.075 = 0.190).

Practice:

1. Which of the following is categorical data?
(A) Height (B) Weight (C) Shoe size (D) Hair color

2. The following table summarizes the hair color of the players on a baseball team. What is the probability that a player has brown hair?
(A) 0.48 (B) 0.16 (C) 0.32 (D) 0.03

3. The following table summarizes the hair color of the players on a baseball team. What is the probability that a player does not have gray hair?
(A) 0.48 (B) 0.97 (C) 0.16 (D) 0.25

4. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. What is the probability that a student gets an A on the first exam and a B on the second exam?
(A) 0.067 (B) 0.1667 (C) 0.333 (D) 0.667

5. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. What is the probability that a student gets a C on the first exam and an A on the second exam?
(A) 10% (B) 16.67% (C) 3.33% (D) 50%

6. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. What is the probability that a student gets a B on the second exam?
(A) 0.5 (B) 0.667 (C) 0.333 (D) 0.25

7. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. What is the probability that a student gets a C on the first exam?
(A) 0.8333 (B) 0.50 (C) 0.25 (D) 0.1667

8. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. Given that a student earns an A on the first exam, what is the probability he or she earns a C on the second exam?
(A) 3.33% (B) 16.67% (C) 20% (D) 25%

9. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
Given that a student earns a B on the first exam, what is the probability he earns a B on the second exam as well?
(A) 2 out of 3 (B) 1 out of 3 (C) 1 out of 2 (D) 7 out of 20

10. The following table summarizes the number of students in a class that received different letter grades on 2 recent exams. The first exam is shown across the top and is summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2. Given that a student earns a B on the first exam, what is the probability that she does not earn a C on the second exam?
(A) 0.50 (B) 0.25 (C) 0.85 (D) 0.15
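
The conditional, combination, and negation calculations from the superhero survey at the start of this handout follow one pattern that also applies to the grade questions above: divide a joint probability by a marginal probability, or add up the joint probabilities that fit. The Python sketch below is one way to check that kind of work. The two male entries marked below are assumptions derived from the quoted totals, and the function names are hypothetical.

```python
# Joint relative frequencies P(Sex & Superhero) for the superhero survey.
# F = female, M = male; WW = Wonder Woman, BM = Batman, SM = Superman.
joint = {
    ("F", "WW"): 0.310, ("F", "BM"): 0.115, ("F", "SM"): 0.075,
    # ASSUMPTION: the next two values are derived from the stated totals
    # (87 WW, 63 BM, 100 men out of 200 surveyed), not quoted in the text.
    ("M", "WW"): 0.125, ("M", "BM"): 0.200,
    ("M", "SM"): 0.175,
}

def marginal_sex(sex):
    """P(sex): add every joint probability in that row."""
    return sum(p for (s, _), p in joint.items() if s == sex)

def conditional(hero, sex):
    """P(hero | sex) = P(sex & hero) / P(sex)."""
    return joint[(sex, hero)] / marginal_sex(sex)

def sex_and_not(sex, hero):
    """P(sex & hero'): add the row entries for every other superhero."""
    return sum(p for (s, h), p in joint.items() if s == sex and h != hero)

print(f"P(SM|Male) = {conditional('SM', 'M'):.3f}")
print(f"P(F)       = {marginal_sex('F'):.3f}")
print(f"P(F & WW') = {sex_and_not('F', 'WW'):.3f}")
```

The same three helper functions work for any two-way relative frequency table once its joint probabilities are entered in place of the superhero values.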