Math 1 Interpreting Categorical & Quantitative Data
Name:____________________
II. Summarize, represent, and interpret data on two categorical and
quantitative variables.
S-ID.5 Summarize categorical data for two categories in two-way frequency
tables. Interpret frequencies in the contexts of the data (including joint,
marginal and conditional relative frequencies). Recognize possible
associations and trends in the data.
Whatever way we spin it, statistics is about numbers. So obviously, it makes sense that
statisticians use a lot of numerical data (height, weight, age, etc.), but even that gets too easy
after a while. Data that isn't represented numerically is known as categorical data (eye color,
hair color, sex, etc.).
You should know what to do with categorical data and how to analyze it. You should be
able to analyze data from two different categories. For instance, data collected from both men
and women about their favorite DC comic book superheroes (Wonder Woman, Batman, or
Superman) can be summarized in one table.
This table is a two-way frequency table because it breaks the data down by two categorical
variables: sex (100 of the people surveyed are male and 100 are female) and favorite superhero
(87 prefer Wonder Woman, 63 prefer Batman, and 50 prefer Superman).
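As a rough sketch, the two-way frequency table can be rebuilt in Python. The row and column totals come straight from the text. The six cell counts are an assumption: they are inferred from the relative frequencies quoted later in this section (for example, 0.310 of 200 people is 62) and from the totals, not copied from the original table. The variable names are illustrative only.

```python
# Cell counts inferred from the totals and relative frequencies quoted in
# the text (an assumption; the original table is not reproduced here).
counts = {
    "Male":   {"Wonder Woman": 25, "Batman": 40, "Superman": 35},
    "Female": {"Wonder Woman": 62, "Batman": 23, "Superman": 15},
}
heroes = ["Wonder Woman", "Batman", "Superman"]

# Row totals: one per sex.
row_totals = {sex: sum(row.values()) for sex, row in counts.items()}
# Column totals: one per superhero.
col_totals = {h: sum(counts[sex][h] for sex in counts) for h in heroes}
# Grand total: everyone surveyed.
grand_total = sum(row_totals.values())

# Print the table with its margins.
print(f"{'':8}" + "".join(f"{h:>14}" for h in heroes) + f"{'Total':>8}")
for sex in counts:
    cells = "".join(f"{counts[sex][h]:>14}" for h in heroes)
    print(f"{sex:8}" + cells + f"{row_totals[sex]:>8}")
totals = "".join(f"{col_totals[h]:>14}" for h in heroes)
print(f"{'Total':8}" + totals + f"{grand_total:>8}")
```

The printed margins should match the totals stated above: 100 and 100 by sex, and 87, 63, and 50 by superhero.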
You should also be able to convert this data into a two-way relative frequency table:
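One way to sketch the conversion in Python: divide every count by the grand total (200 people). The cell counts below are an assumption, inferred from the totals and the relative frequencies quoted in this section rather than copied from the original table.

```python
# Cell counts inferred from the text (an assumption, not the original table).
counts = {
    "Male":   {"Wonder Woman": 25, "Batman": 40, "Superman": 35},
    "Female": {"Wonder Woman": 62, "Batman": 23, "Superman": 15},
}

# Grand total of people surveyed.
grand_total = sum(sum(row.values()) for row in counts.values())

# Relative frequency = count / grand total, rounded to 3 decimal places.
relative = {
    sex: {hero: round(n / grand_total, 3) for hero, n in row.items()}
    for sex, row in counts.items()
}

for sex, row in relative.items():
    print(sex, row)
```

Every entry in the relative frequency table is a fraction of the whole survey, so all six entries together add up to 1.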
The numbers in the middle are called joint probabilities because they depend on more
than one category or event occurring at the same time. In this case, we want to know if a person
is male or female and which superhero they prefer. So written in math language, each entry in
the table represents P(Sex & Superhero).
The marginal probabilities represent the probability of only one category, P(Sex) or
P(Superhero). They're called marginal because they're on the margins of the table.
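A marginal probability is the sum of the joint probabilities in its row or column. The sketch below illustrates this with exact fractions. The female row and the Male & Superman cell are quoted in this section; the other two male cells are inferred from the totals and are an assumption. The helper names `marginal_sex` and `marginal_hero` are hypothetical.

```python
from fractions import Fraction

# Joint relative frequencies P(Sex & Superhero) as exact fractions of 200.
# Male/WW and Male/BM are inferred from the totals (an assumption).
joint = {
    ("Male", "WW"): Fraction(25, 200),
    ("Male", "BM"): Fraction(40, 200),
    ("Male", "SM"): Fraction(35, 200),
    ("Female", "WW"): Fraction(62, 200),
    ("Female", "BM"): Fraction(23, 200),
    ("Female", "SM"): Fraction(15, 200),
}

def marginal_sex(sex):
    """P(Sex): add up the joint probabilities across that row."""
    return sum(p for (s, _), p in joint.items() if s == sex)

def marginal_hero(hero):
    """P(Superhero): add up the joint probabilities down that column."""
    return sum(p for (_, h), p in joint.items() if h == hero)

print(float(marginal_sex("Male")))   # 0.5
print(float(marginal_hero("WW")))    # 0.435
```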
If we know the data for one category and not the other (say, we know the person is male, but not
which superhero they prefer), we can calculate the probability that his favorite superhero is
Superman. This is called a conditional probability because it is conditional on knowing part of
the data. We write this in math language as P(SM|Male). (The | symbol means "given.")
We can calculate P(SM|Male) from the relative frequency table because we know 0.50 of the people
surveyed were men, and 0.175 of the people surveyed were both male and preferred Superman.
We divide the joint probability by the marginal probability to determine:
P(SM|Male) = P(Male & SM) / P(Male) = 0.175 / 0.50 = 0.35
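A minimal Python sketch of this conditional probability calculation, using the two values quoted above. The function name `conditional` is hypothetical, chosen only for this illustration.

```python
def conditional(p_joint, p_given):
    """Return P(A | B) = P(A & B) / P(B)."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given

p_male = 0.50           # marginal probability P(Male)
p_male_and_sm = 0.175   # joint probability P(Male & SM)

p_sm_given_male = conditional(p_male_and_sm, p_male)
print(round(p_sm_given_male, 3))  # 0.35
```

In words: of the men surveyed, 35% prefer Superman.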
You should feel comfortable creating, understanding, and using these tables to calculate
probabilities for more than two categories. You should also be able to determine the probability
of combinations and negations.
Combination: P(F) = P(F & WW) + P(F & BM) + P(F & SM) = 0.310 + 0.115 + 0.075 = 0.500.
Negation: P(F & WW') is the probability that a surveyed person is female and does not prefer
Wonder Woman, so P(F & WW') = P(F & BM) + P(F & SM) = 0.115 + 0.075 = 0.190.
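The combination and negation calculations can be checked with a short Python sketch that uses only the three joint probabilities quoted above. The variable names are illustrative, and results are rounded because decimal fractions are not stored exactly in floating point.

```python
# Joint probabilities for the female row, as quoted in the text.
p_f_ww = 0.310  # P(F & WW)
p_f_bm = 0.115  # P(F & BM)
p_f_sm = 0.075  # P(F & SM)

# Combination: P(F) is the sum of every joint probability in the female row.
p_f = p_f_ww + p_f_bm + p_f_sm

# Negation: P(F & WW') adds up the female cells that are NOT Wonder Woman.
p_f_not_ww = p_f_bm + p_f_sm

# Equivalent check: remove the Wonder Woman cell from the female marginal.
p_f_not_ww_check = p_f - p_f_ww

print(round(p_f, 3))               # 0.5
print(round(p_f_not_ww, 3))        # 0.19
print(round(p_f_not_ww_check, 3))  # 0.19
```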
Practice:
1. Which of the following is categorical data?
(A) Height
(B) Weight
(C) Shoe size
(D) Hair color
2. The following table summarizes the hair color of a baseball team. What is the probability
that a player has brown hair?
(A) 0.48
(B) 0.16
(C) 0.32
(D) 0.03
3. The following table summarizes the hair color of a baseball team. What is the probability
that a player does not have gray hair?
(A) 0.48
(B) 0.97
(C) 0.16
(D) 0.25
4. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
What is the probability that a student gets an A on the first exam and a B on the second
exam?
(A) 0.067
(B) 0.1667
(C) 0.333
(D) 0.667
5. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
What is the probability that a student gets a C on the first exam and an A on the second
exam?
(A) 10%
(B) 16.67%
(C) 3.33%
(D) 50%
6. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
What is the probability that a student gets a B on the second exam?
(A) 0.5
(B) 0.667
(C) 0.333
(D) 0.25
7. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
What is the probability that a student gets a C on the first exam?
(A) 0.8333
(B) 0.50
(C) 0.25
(D) 0.1667
8. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
Given that a student earns an A on the first exam, what is the probability he or she earns a
C on the second exam?
(A) 3.33%
(B) 16.67%
(C) 20%
(D) 25%
9. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
Given that a student earns a B on the first exam, what is the probability he earns a B on
the second exam as well?
(A) 2 out of 3
(B) 1 out of 3
(C) 1 out of 2
(D) 7 out of 20
10. The following table summarizes the number of students in a class that received different
letter grades on 2 recent exams. The first exam is shown across the top and is
summarized as A1, B1, and C1, and the second exam is in the first column, A2, B2, and C2.
Given that a student earns a B on the first exam, what is the probability that she does not
earn a C on the second exam?
(A) 0.50
(B) 0.25
(C) 0.85
(D) 0.15